LOGIC CIRCUIT AND METHOD FOR CARRY AND SUM GENERATION 
AND METHOD OF DESIGNING SUCH A LOGIC CIRCUIT 



This application claims priority under 35 U.S.C. 119(e) from U.S. Provisional 
Application Serial No. 60/436,179 filed December 23, 2002, the specification of 
which is incorporated herein by reference and made a part hereof. 

The present invention relates to a method and apparatus for use in logic circuits, and 
in particular, to a method and apparatus for generating a carry or sum bit by 
combining two binary inputs. 

Addition of two binary numbers is a fundamental operation used in many electronic 
circuits. For example, binary addition is used in integer arithmetic-logic units, and 
all floating-point operations also use integer addition in their calculations. 
Memory accesses require integer addition for address generation, and branches use 
addition for forming instruction addresses and for making greater-than or less-than 
comparisons. Thus, many modern circuits contain several integer adders, many of 
which may appear on frequency-limiting paths. 


In an addition of two numbers, the digit in each column of the first number is added 
to the digit in the corresponding column of the second number, and any carry digit 
resulting from the previous column is also added, in order to obtain the value of the 
sum in each column. Thus, for two n-bit binary numbers a = a_{n-1}...a_1a_0 and
b = b_{n-1}...b_1b_0, their sum is the (n+1)-bit number given by s = s_n...s_1s_0, where:

s_i = a_i \oplus b_i \oplus c_i

c_{i+1} = a_i b_i + c_i (a_i + b_i)

M&C Folio: USP86939X
SLWK 1365.063US1

where c_k is the carry into position k, + denotes logical OR, juxtaposition denotes
logical AND and \oplus denotes exclusive OR.
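The sum and carry recurrences above can be sketched in software as follows. This is only an illustrative bit-serial model, not a circuit from the specification; the function name ripple_add and its integer interface are assumptions made for the example.

```python
def ripple_add(a: int, b: int, n: int) -> int:
    """Add two n-bit integers column by column (hypothetical helper)."""
    c = 0  # carry into position 0
    s = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        s |= (a_i ^ b_i ^ c) << i            # s_i = a_i XOR b_i XOR c_i
        c = (a_i & b_i) | (c & (a_i | b_i))  # c_{i+1} = a_i b_i + c_i (a_i + b_i)
    return s | (c << n)                      # the final carry becomes s_n
```

Because each carry depends on the previous one, the loop mirrors the serial dependency that the faster schemes discussed in the text try to shorten.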



The carry bit into any chosen column can be generated from two logical functions 
called Generate and Propagate. The bit-level generate function g_i indicates whether a 
carry is generated by a particular column in the addition. The function g_i is true if a 
carry is generated at column i. The bit-level propagate function p_i indicates whether 
any carry for a particular column will be propagated on to the next column. The 
function p_i is true if the carry into column i is propagated into column i+1. The bit-level 
generate and propagate functions can be constructed from the bits in column i of the 
two numbers to be added, as follows: 

g_i = a_i b_i
p_i = a_i + b_i

Thus, in the addition of a = a_{n-1}...a_1a_0 and b = b_{n-1}...b_1b_0, the carry into the
(j+1)'th column is given by: 

G_{j:0} = c_{j+1} = g_j + p_j g_{j-1} + p_j p_{j-1} g_{j-2} + ... + p_j p_{j-1} ... p_1 g_0
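As a hedged illustration, the flat sum-of-products form of G_{j:0} can be evaluated directly in software. The helper names bit_generate_propagate and carry_flat are hypothetical, and the nested loops stand in for the wide AND and OR gates of the flat circuit.

```python
def bit_generate_propagate(a: int, b: int, n: int):
    """Return lists g, p with g[i] = a_i AND b_i and p[i] = a_i OR b_i."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    return g, p


def carry_flat(g, p, j: int) -> int:
    """Evaluate G_{j:0} = g_j + p_j g_{j-1} + ... + p_j ... p_1 g_0."""
    carry = 0
    for k in range(j + 1):
        term = g[k]                    # AND-term headed by g_k
        for m in range(k + 1, j + 1):
            term &= p[m]               # ... multiplied by p_{k+1} ... p_j
        carry |= term                  # OR of all j+1 AND-terms
    return carry
```

The term headed by g_0 needs j propagate inputs, which is the fan-in growth the text points out.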

Figure 1 shows an implementation of a circuit to generate G_{j:0} based on the above 
equation. However, the circuit of Figure 1 is not practical to realize for large 
values of j. It is an OR of j+1 AND-terms, the largest of the AND gates also having 
j+1 inputs. Moreover, the fan-out of the p's is very large, p_j having a fan-out of j. 

High-speed practical implementations realize the carry function in a tree-like 
structure. A prior art method is known as parallel prefix and will now be illustrated 
(S. Knowles, "A Family of Adders", Proc. 14th IEEE Symp. on Computer 
Arithmetic, pp. 30-44, 1999). The parallel prefix method uses bit-level generate and 
propagate functions to construct Group Generate and Group Propagate functions. 






G_{j:k} = g_j + p_j g_{j-1} + p_j p_{j-1} g_{j-2} + ... + p_j p_{j-1} ... p_{k+1} g_k
P_{j:k} = p_j p_{j-1} ... p_{k+1} p_k



The function G_{j:k} is true if the group of bits from k to j generates a carry and the 
function P_{j:k} is true if the group of bits from k to j propagates a carry coming into 
that group into the next group. 

The parallel prefix method uses Group Generate and Group Propagate functions of 
smaller groups to construct the Group Generate and Group Propagate functions 
of a larger group. A large group of bits from i to j is divided into two groups, say from i 
to k-1 and from k to j. The larger group generates a carry if either the most significant 
group generates a carry, or the least significant group generates a carry and the most 
significant group propagates this carry. This is illustrated in Figure 2. In logical 
notation this can be expressed as:

G_{j:i} = G_{j:k} + P_{j:k} G_{k-1:i}



The Group Propagate function of a large group can be constructed from Group 
Propagate functions of smaller groups: 

P_{j:i} = P_{j:k} P_{k-1:i}
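The two combining rules can be sketched as a single operator on (Generate, Propagate) pairs, applied recursively. This is a minimal model only: the names combine and group_gp and the halving split point are assumptions of the example, and real parallel prefix adders choose their split points according to the particular prefix network.

```python
def bit_generate_propagate(a: int, b: int, n: int):
    """Return lists g, p with g[i] = a_i AND b_i and p[i] = a_i OR b_i."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    return g, p


def combine(hi, lo):
    """Merge (G, P) of the more significant group with (G, P) of the less significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def group_gp(g, p, j: int, i: int):
    """Return (G_{j:i}, P_{j:i}) by recursively splitting the group into k..j and i..k-1."""
    if i == j:
        return g[j], p[j]
    k = (i + j + 1) // 2  # assumed split point: roughly in the middle
    return combine(group_gp(g, p, j, k), group_gp(g, p, k - 1, i))
```

For example, group_gp(g, p, 4, 0)[0] gives the carry out of a 5-bit addition.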



These two constructions allow the Group Generate of a larger group to be formed 
recursively from smaller groups, which themselves are formed from even smaller 
groups and so on. 

This method allows for the construction of G_{j:i} in \lceil \log_2(j - i) \rceil levels, once the 
bit-level generate and propagate functions have been formed. 



It is possible to form the Group Generate of a large group in fewer levels still. If the 
large group i to j is divided into three groups, say i to k'-1, k' to k''-1, and k'' to j, then: 

G_{j:i} = G_{j:k''} + P_{j:k''} G_{k''-1:k'} + P_{j:k''} P_{k''-1:k'} G_{k'-1:i}



The drawback of this method is that although fewer combining levels are needed, the 
gates at each combining level are more complex and the fan-out on the Group 
Generate and Group Propagate functions increases. Both of these impact heavily on 
the delay of the circuit. This situation is further exacerbated when all the carries for 
an adder need to be constructed. 

The following is an example of the parallel prefix method for a 9-bit addition, using 
base 3. A circuit diagram for this example is shown in Figure 3. 

Given two 9-bit numbers a = a_8a_7...a_1a_0 and b = b_8b_7...b_1b_0, we form 3-bit groups 
a_8a_7a_6, a_5a_4a_3, a_2a_1a_0 for a and b_8b_7b_6, b_5b_4b_3, b_2b_1b_0 for b. 

Then the generate and propagate functions for each group are 

G_{8:6} = g_8 + p_8 g_7 + p_8 p_7 g_6,   P_{8:6} = p_8 p_7 p_6
G_{5:3} = g_5 + p_5 g_4 + p_5 p_4 g_3,   P_{5:3} = p_5 p_4 p_3
G_{2:0} = g_2 + p_2 g_1 + p_2 p_1 g_0,   P_{2:0} = p_2 p_1 p_0

These Group functions are now combined to form: 

G_{8:0} = G_{8:6} + P_{8:6} G_{5:3} + P_{8:6} P_{5:3} G_{2:0}

The other carries could be constructed in the following manner: 

G_{7:0} = G_{7:6} + P_{7:6} G_{5:3} + P_{7:6} P_{5:3} G_{2:0}
G_{6:0} = G_{6:6} + P_{6:6} G_{5:3} + P_{6:6} P_{5:3} G_{2:0}
G_{5:0} = G_{5:3} + P_{5:3} G_{2:0}
G_{4:0} = G_{4:3} + P_{4:3} G_{2:0}
G_{2:0} = g_2 + p_2 g_1 + p_2 p_1 g_0
G_{1:0} = g_1 + p_1 g_0
G_{0:0} = g_0
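The 9-bit, base-3 example can be checked with a short software model. The sketch below only evaluates the final carry G_{8:0} from the three 3-bit groups; the function name prefix_carry_9bit and the inner helpers are assumptions made for illustration.

```python
def prefix_carry_9bit(a: int, b: int) -> int:
    """Carry out of a 9-bit addition via three 3-bit groups (base-3 parallel prefix)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(9)]    # g_i = a_i b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(9)]  # p_i = a_i + b_i

    def group_g(hi: int) -> int:
        # G_{hi:hi-2} = g_hi + p_hi g_{hi-1} + p_hi p_{hi-1} g_{hi-2}
        return g[hi] | (p[hi] & g[hi - 1]) | (p[hi] & p[hi - 1] & g[hi - 2])

    def group_p(hi: int) -> int:
        # P_{hi:hi-2} = p_hi p_{hi-1} p_{hi-2}
        return p[hi] & p[hi - 1] & p[hi - 2]

    g86, g53, g20 = group_g(8), group_g(5), group_g(2)
    p86, p53 = group_p(8), group_p(5)
    # G_{8:0} = G_{8:6} + P_{8:6} G_{5:3} + P_{8:6} P_{5:3} G_{2:0}
    return g86 | (p86 & g53) | (p86 & p53 & g20)
```

The result should agree with bit 9 of the ordinary integer sum a + b for any 9-bit inputs.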

An improved prior art technique for determining the carry bits is the Ling method 
(H. Ling, "High Speed Binary Adder", IBM Journal of Research and Development, 
Vol. 25, No. 3, pp. 156-166, 1981). Ling observed a variation of the above, which 
allows for a small speed-up over the parallel prefix method. He observed that if the 
delay of the carry term G_{j:i} could be reduced by increasing the delay of some other 
term, the overall delay will be reduced as long as the carry term is still on the critical 
path. Ling observed that every term in 

G_{j:i} = g_j + p_j g_{j-1} + p_j p_{j-1} g_{j-2} + ... + p_j p_{j-1} ... p_{i+1} g_i

contains p_j except for the very first term, which is simply g_j. However, G_{j:i} can still 
be simplified by noting that 

g_k = p_k g_k

Therefore p_j can be factored out of G_{j:i} to create a pseudocarry H_{j:i}, where 

G_{j:i} = p_j H_{j:i}
H_{j:i} = g_j + G_{j-1:i}
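Ling's factorisation can be checked exhaustively for small widths. The following sketch assumes p_i = a_i + b_i (inclusive OR), which is what makes g_k = p_k g_k hold; the function names are hypothetical.

```python
def bit_generate_propagate(a: int, b: int, n: int):
    """Return lists g, p with g[i] = a_i AND b_i and p[i] = a_i OR b_i."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    return g, p


def group_generate(g, p, j: int, i: int) -> int:
    """G_{j:i}, built up one bit at a time: G_{k:i} = g_k + p_k G_{k-1:i}."""
    G = 0
    for k in range(i, j + 1):
        G = g[k] | (p[k] & G)
    return G


def pseudocarry(g, p, j: int, i: int) -> int:
    """H_{j:i} = g_j + G_{j-1:i}; the lower term is absent when j == i."""
    lower = group_generate(g, p, j - 1, i) if j > i else 0
    return g[j] | lower
```

With these definitions, group_generate(g, p, j, i) should equal p[j] & pseudocarry(g, p, j, i) for every input pair.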

The function H_{j:i} is a little simpler than the function G_{j:i}. The fan-in of the OR gate 
for H_{j:i} and G_{j:i} is the same but the fan-in of each AND gate is reduced by 1. This is 
illustrated in Figure 4. Ling also observed that the pseudocarry H_{j:i} of a large group 
could be constructed from the pseudocarries H_{j:k} and H_{k-1:i} of smaller groups: 

H_{j:i} = g_j + G_{j-1:i} = g_j + G_{j-1:k} + P_{j-1:k} G_{k-1:i}
        = [g_j + G_{j-1:k}] + P_{j-1:k} p_{k-1} [g_{k-1} + G_{k-2:i}]
        = H_{j:k} + P_{j-1:k-1} H_{k-1:i}

This provides a method for constructing the pseudocarry of a large group in terms of 
pseudocarries of smaller groups, which can be constructed from the pseudocarries of 
yet still smaller groups. 

As in the parallel prefix case, more than two pseudocarries can be combined to form 
the pseudocarry of a large group. 

If the large group i to j is divided into three groups, say i to k'-1, k' to k''-1, and k'' to j, 
then: 

G_{j:i} = p_j H_{j:i}

G_{j-1:i} = G_{j-1:k''} + P_{j-1:k''} G_{k''-1:k'} + P_{j-1:k''} P_{k''-1:k'} G_{k'-1:i}

G_{k''-1:k'} = p_{k''-1} H_{k''-1:k'}
G_{k'-1:i} = p_{k'-1} H_{k'-1:i}

H_{j:i} = H_{j:k''} + P_{j-1:k''-1} H_{k''-1:k'} + P_{j-1:k''-1} P_{k''-2:k'-1} H_{k'-1:i}

This method still suffers the same problems as the parallel prefix method, that is, 
more complex gates. Note that H_{j:i} has the form H_2 + P_2 H_1 + P_2 P_1 H_0, which is 
exactly the same as that of the Group Generate function G_2 + P_2 G_1 + P_2 P_1 G_0 in the 
parallel prefix method, and the higher fan-out is also the same. Ling's method will 
now be illustrated by way of example. 

The following is an example of a 9-bit Ling adder, which is illustrated in figure 5a. 

G_{8:0} = G_{8:6} + P_{8:6} G_{5:3} + P_{8:6} P_{5:3} G_{2:0} = p_8 H_{8:0}
H_{8:0} = H_{8:6} + P_{7:5} H_{5:3} + P_{7:5} P_{4:2} H_{2:0}

The pseudocarry functions are: 

H_{8:6} = g_8 + g_7 + p_7 g_6
H_{5:3} = g_5 + g_4 + p_4 g_3
H_{2:0} = g_2 + g_1 + p_1 g_0
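The 9-bit Ling example can be modelled in the same way. This is a sketch for checking the equations only; the function name ling_carry_9bit is an assumption, and the propagate signals are taken to be p_i = a_i + b_i as earlier in the text.

```python
def ling_carry_9bit(a: int, b: int) -> int:
    """Carry out of a 9-bit addition computed as p_8 H_{8:0} (Ling form)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(9)]    # g_i = a_i b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(9)]  # p_i = a_i + b_i

    # First-level pseudocarries of the three 3-bit groups
    h86 = g[8] | g[7] | (p[7] & g[6])
    h53 = g[5] | g[4] | (p[4] & g[3])
    h20 = g[2] | g[1] | (p[1] & g[0])

    # Group propagates, shifted down by one bit relative to parallel prefix
    p75 = p[7] & p[6] & p[5]
    p42 = p[4] & p[3] & p[2]

    # H_{8:0} = H_{8:6} + P_{7:5} H_{5:3} + P_{7:5} P_{4:2} H_{2:0}
    h80 = h86 | (p75 & h53) | (p75 & p42 & h20)
    return p[8] & h80  # G_{8:0} = p_8 H_{8:0}
```

The first-level functions each save one AND input compared with the corresponding Group Generate, while the second level has the same A + BC + DEF shape.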

Note that at the first level, the highest complexity function for Ling has the form H_2 
+ H_1 + P_1 H_0, whereas for parallel prefix this is G_2 + P_2 G_1 + P_2 P_1 G_0. 

But the complexity of H_{8:0} is the same as G_{8:0}, both being of the form A + BC + 
DEF. One may try to combine P_{7:5} P_{4:2} and thus reduce the complexity of the second 
level to A + BC + DE, but 

P_{7:5} P_{4:2} = P_{7:2} = p_7 p_6 p_5 p_4 p_3 p_2

which is an AND of 6 terms and generally slower to calculate than 

H_{2:0} = g_2 + g_1 + p_1 g_0

The Ling adder does have the problem that, to produce the actual carry out, the logical 
AND of p_j and H_{j:i} needs to be formed, which would impact the delay. This extra 
delay can however be eliminated by noting that the critical path for an n-bit adder is in 
producing the (n-1)'th bit, which can be expressed as: 

s_{n-1} = a_{n-1} \oplus b_{n-1} \oplus G_{n-2:0}
        = a_{n-1} \oplus b_{n-1} \oplus p_{n-2} H_{n-2:0}

But p_{n-2} can be computed faster than H_{n-2:0} and so a multiplexer can be used. This is 
shown in Figure 5b. 

s_{n-1} = (a_{n-1} \oplus b_{n-1} \oplus p_{n-2}) H_{n-2:0} + (a_{n-1} \oplus b_{n-1}) H^{c}_{n-2:0}

where the superscript c denotes the logical complement.



Although Ling's method is better than the parallel prefix method, it nevertheless has 
a number of shortcomings. It parallelizes the computation of G_{j:i} as p_j H_{j:i}, but one of 
the functions, p_j, is a very simple bit-level propagate while the other function, H_{j:i}, is 
much more complex and so the parallelization is very limited. This parallelization, 
G_{j:i} = p_j H_{j:i}, cannot be extended to more than two functions; that is, no method is 
provided to parallelize G_{j:i} as XYZ etc. Ling's method allows for the speed-up of the 
first level only (compared to the parallel prefix method) and even this is very limited, 
allowing for a reduction in the fan-in of the AND gates at the first level by at 
most 1. It offers no advantage over the parallel prefix method when combining Group 
functions, in terms of the complexity of the gates and the fan-out of Group functions. 

The first drawback of Ling's approach is that although the carry function G_{j:i} = p_j H_{j:i} 
is broken down as a combination of two simpler functions, which can be computed 
in parallel, one of the functions is a very simple p_j = a_j + b_j while the second is much 
more complex. Thus the impact on the delay in calculating the carry is very small. 

A further prior art technique for generating carry bits is described in US patent 
number 5,964,827 (IBM Corporation). The IBM technique involves generating G_{3:0} 
by factorising p_3 p_2 out of the expression for G_{3:0}. The result is: 

G_{3:0} = g_3 + p_3 p_2 [g_2 + g_1 + p_1 g_0] = [g_3 + p_3 p_2][g_3 + g_2 + g_1 + p_1 g_0]
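The bit-level factorisation can be confirmed by exhaustive comparison with the unfactored Group Generate. The sketch below is illustrative only; the function names are hypothetical and the bit-level signals are assumed to be g_i = a_i b_i and p_i = a_i + b_i as defined earlier.

```python
def g30_direct(a: int, b: int) -> int:
    """G_{3:0} = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    return (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0]))


def g30_factored(a: int, b: int) -> int:
    """G_{3:0} = [g_3 + p_3 p_2][g_3 + g_2 + g_1 + p_1 g_0]; every AND has two inputs."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    left = g[3] | (p[3] & p[2])
    right = g[3] | g[2] | g[1] | (p[1] & g[0])
    return left & right
```

The two forms agree because g_2 = p_2 g_2, so the term p_3 g_2 may be written as p_3 p_2 g_2.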

The function G_{15:0} is then determined using a similar factorisation involving a group 
function, giving: 

G_{15:0} = [G_{15:12} + P_{15:12} P_{11:8}][G_{15:12} + G_{11:8} + G_{7:4} + P_{7:4} G_{3:0}].

The IBM method provides the advantage that the above factorisation reduces all 
AND gates to only two inputs. This is particularly useful in dynamic logic 
implementations because AND gates slow down significantly as the number of 
inputs is increased. Thus, the aim of the IBM idea is to reduce the number of inputs 
to a minimum for each AND gate. This can be achieved by combining only four bits 
at each level to produce a group generate function or a carry, and performing the 
above factorisation, in which each AND gate has only two inputs. In this type of 
technology, it is not as crucial to limit the number of inputs on an OR gate. However, 
in the IBM method, the generate function is fully calculated at each stage by 
performing an AND operation between the two terms in brackets. This is 
unnecessary, and slows down the circuit. 

The present invention uses reduced generate logic, which is simpler than the 
generate logic, i.e. less logic is required and the computation is faster. The output of 
generate logic indicates if a carry will be generated out of a group of input bits. The 
output of reduced generate logic for a group of input bits, partitioned into at least one 
most significant bit and at least one least significant bit, is the logical OR of a 
generate logic for the least significant bits and logic for performing a function X for 
the most significant bits. X represents a function which is high if a carry is generated 
out of the most significant bits, low if no carry is generated at any bit position in the 
most significant bits, and in a don't care state if a carry is generated at some bit 
position in the most significant bits but no carry is generated out of the most 
significant bits. 
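The three conditions on X can be expressed as a small exhaustive checker. This is a sketch under stated assumptions: "a carry is generated at some bit position" is read here as some bit-level generate g_m being high within the most significant bits, and the candidate functions below (including the plain OR of the bit-level generates) are hypothetical examples chosen for the illustration, not functions taken from the text.

```python
from itertools import product


def group_generate(g, p) -> int:
    """Group Generate over lists ordered least significant bit first."""
    G = 0
    for g_k, p_k in zip(g, p):
        G = g_k | (p_k & G)
    return G


def is_valid_x(x_func, width: int) -> bool:
    """Check x_func against the 'high' and 'low' conditions; other inputs are don't care."""
    for bits in product((0, 1), repeat=2 * width):
        a, b = bits[:width], bits[width:]
        g = [x & y for x, y in zip(a, b)]
        p = [x | y for x, y in zip(a, b)]
        x = x_func(g, p)
        if group_generate(g, p) == 1 and x != 1:
            return False  # must be high when a carry is generated out of the group
        if not any(g) and x != 0:
            return False  # must be low when no bit position generates a carry
    return True


def x_or_of_generates(g, p) -> int:
    """Assumed candidate: OR of the bit-level generates (uses the don't care cases)."""
    return int(any(g))


def x_all_propagates(g, p) -> int:
    """Deliberately invalid candidate, included only as a counterexample."""
    return int(all(p))
```

Under this reading, both the full Group Generate and the plain OR of bit-level generates pass the check, the latter needing no AND gates, while the all-propagate function fails it.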

One aspect of the present invention provides a method and apparatus for forming 
reduced generate logic for a group of input bits using at least one reduced generate 
output for at least one subgroup of the group of input bits, at least one reduced 
generate logic generating an output based on an X function using at least two most 
significant input bits. 

One aspect of the present invention provides a method and apparatus for carry 
generation in which logic is arranged in levels of logic in which each level computes 
reduced generate functions, and lower levels compute reduced generate functions 
from reduced generate functions at higher levels, wherein at least one of the reduced 
generate functions has an X component ranging over at least two bits. The levels are 
preferably levels in a tree structure. 

Another aspect provides a logic circuit for generation of a carry bit output by 
combining two sets of binary inputs, the logic circuit comprising first logic for 
receiving a plurality of bits of the binary inputs and for generating at least one 
intermediate output; final logic for receiving at least one intermediate output of the 
first logic and for generating the carry bit output; wherein said final logic is arranged 
to generate the carry bit output using a reduced generate function for a group of bits 
of the binary inputs and at least one intermediate output from said first logic at least 
one of which is generated as a reduced generate function of a sub-group of bits of the 
binary inputs; wherein a reduced generate function for a group of bits, partitioned 
into at least one most significant bit and at least one least significant bit, is the 
logical OR of a generate function for the least significant bits and a function X for 
the most significant bits, where the generate function is high if a carry is generated 
out of the least significant bits and low if not, and X is a function which is high if a 
carry is generated out of the most significant bits, low if no carry is generated at any 
bit position in the most significant bits, and in a don't care state if a carry is 
generated at some bit position in the most significant bits but no carry is generated 
out of the most significant bits; and wherein first logic and/or said final logic is 
arranged to use a reduced generate function in which the group or sub-group of bits 
of the binary inputs is partitioned so that said at least one most significant bit 
comprises at least two most significant bits. 

Another aspect provides a logic circuit for generation of a sum bit output by 
combining two sets of binary inputs, the logic circuit comprising first logic for 
receiving a plurality of bits of the binary inputs and for generating at least one 
intermediate output; final logic for receiving at least one intermediate output of the 
first logic and for generating the sum bit output; wherein said final logic is arranged 
to generate the sum bit output using a reduced generate function for a group of bits 
of the binary inputs and at least one intermediate output from said first logic at least 
one of which is generated as a reduced generate function of a sub-group of bits of the 
binary inputs; wherein a reduced generate function for a group of bits, partitioned 
into at least one most significant bit and at least one least significant bit, is the 
logical OR of a generate function for the least significant bits and a function X for 
the most significant bits, where the generate function is high if a carry is generated 
out of the least significant bits and low if not, and X is a function which is high if a 
carry is generated out of the most significant bits, low if no carry is generated at any 
bit position in the most significant bits, and in a don't care state if a carry is 
generated at some bit position in the most significant bits but no carry is generated 
out of the most significant bits; and wherein first logic and/or said final logic is 
arranged to use a reduced generate function in which the group or sub-group of bits 
of the binary inputs is partitioned so that said at least one most significant bit 
comprises at least two most significant bits. 

In this aspect of the present invention, the sum bit is calculated directly using the 
reduced generate function, rather than generating the carry and logically exclusive 
OR combining the carry bit with the exclusive OR combination of input bits. In one 
embodiment the final logic includes at least one multiplexer. 
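One way to picture direct sum generation with a multiplexer is the following software sketch. It is an assumption-laden illustration rather than the claimed circuit: X is taken to be the OR of the bit-level generates of the most significant group, the multiplexer is selected by the reduced generate output, and the names sum_bit_via_mux, further and reduced are hypothetical.

```python
def group_generate(g, p) -> int:
    """Group Generate over lists ordered least significant bit first."""
    G = 0
    for g_k, p_k in zip(g, p):
        G = g_k | (p_k & G)
    return G


def sum_bit_via_mux(a: int, b: int, n: int, k: int) -> int:
    """Sum bit s_n of two (n+1)-bit inputs; bits k..n-1 form the most significant group."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    g_hi, p_hi = g[k:], p[k:]

    # Assumed X function: OR of bit-level generates in the most significant group
    reduced = int(any(g_hi)) | group_generate(g[:k], p[:k])
    # High if the most significant group generates a carry or propagates one
    further = group_generate(g_hi, p_hi) | int(all(p_hi))

    x = ((a >> n) ^ (b >> n)) & 1  # a_n XOR b_n
    when_high = x ^ further        # both data inputs depend only on few bits
    when_low = x
    return when_high if reduced else when_low  # multiplexer selected by the reduced generate
```

Choosing k = n - 2 gives a most significant group of two bits; the carry itself is never formed explicitly, since the late-arriving reduced generate only drives the select input.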

Another aspect provides a logic circuit for generation of a carry bit output by 
combining two sets of binary inputs, the logic circuit comprising a first level of logic 
comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; at least one further 
level of logic including a final level of logic for receiving outputs of at least one 
previous level of logic and comprising at least one logic unit for receiving the 
intermediate outputs from at least one logic unit of at least one previous level and for 
generating an intermediate output; and output logic for generating the carry bit 
output using at least one of the intermediate outputs from the final level of logic; 
wherein at least one logic unit of at least one level of logic is arranged to generate an 
intermediate output as a reduced generate function for a group of bits of the binary 
inputs using intermediate outputs from at least one higher level at least one of which 
is generated as a reduced generate function of a sub-group of bits of the binary 
inputs; wherein an intermediate output generated as a reduced generate function for a 
group or sub-group of bits, partitioned into at least one most significant bit and at 
least one least significant bit, is the logical OR of a generate function for the least 
significant bits and a function X for the most significant bits, where the generate 
function is high if a carry is generated out of the least significant bits and low if not, 
and X is a function which is high if a carry is generated out of the most significant 
bits, low if no carry is generated at any bit position in the most significant bits, and 
in a don't care state if a carry is generated at some bit position in the most significant 
bits but no carry is generated out of the most significant bits; and wherein at least 
one of said at least one logic unit of at least one level of logic is arranged to generate 
an intermediate output as a reduced generate function in which the group or sub-group 
of bits of the binary inputs for said at least one logic unit is partitioned so that 
said at least one most significant bit comprises at least two most significant bits. 

In one embodiment further logic is provided for generating an output for a group of 
most significant bits of the binary inputs which is high if a carry is generated out of 
the group or if all of the bit level propagate bits for the group are high, wherein said 
output logic is arranged to generate the carry bit as a function of the logical AND of 
the output of said further logic and the intermediate output of said final level 
generated as a reduced generate function for a group of bits. 
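The AND combination described in this embodiment can be checked numerically. The sketch below makes the same assumption as before, that X is the OR of the bit-level generates in the most significant group, which is only one function satisfying the stated conditions; the function and variable names are hypothetical.

```python
def group_generate(g, p) -> int:
    """Group Generate over lists ordered least significant bit first."""
    G = 0
    for g_k, p_k in zip(g, p):
        G = g_k | (p_k & G)
    return G


def carry_via_reduced_generate(a: int, b: int, n: int, k: int) -> int:
    """Carry out of an n-bit addition; bits k..n-1 form the most significant group."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    g_hi, p_hi = g[k:], p[k:]

    # Further logic: high if the group generates a carry or all its propagates are high
    further = group_generate(g_hi, p_hi) | int(all(p_hi))
    # Reduced generate: assumed X (OR of generates) ORed with the generate of the low bits
    reduced = int(any(g_hi)) | group_generate(g[:k], p[:k])
    return further & reduced
```

With a single most significant bit (k = n - 1) this reduces to Ling's G = p H; with k = n - 2 or lower, the X function ranges over two or more bits.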

A second aspect provides a logic circuit for generation of a sum bit output by 
combining two sets of binary inputs, the logic circuit comprising a first level of logic 
comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; at least one further 
level of logic including a final level of logic for receiving outputs of at least one 
previous level of logic and comprising at least one logic unit for receiving the 
intermediate outputs from at least one logic unit of at least one previous level and for 
generating an intermediate output; and output logic for generating the sum bit output 
using at least one of the intermediate outputs from the final level of logic; wherein at 
least one logic unit of at least one level of logic is arranged to generate an 
intermediate output as a reduced generate function for a group of bits of the binary 
inputs using intermediate outputs from at least one higher level at least one of which 
is generated as a reduced generate function of a sub-group of bits of the binary 
inputs; wherein an intermediate output generated as a reduced generate function for a 
group or sub-group of bits, partitioned into at least one most significant bit and at 
least one least significant bit, is the logical OR of a generate function for the least 
significant bits and a function X for the most significant bits, where the generate 
function is high if a carry is generated out of the least significant bits and low if not, 
and X is a function which is high if a carry is generated out of the most significant 
bits, low if no carry is generated at any bit position in the most significant bits, and 
in a don't care state if a carry is generated at some bit position in the most significant 
bits but no carry is generated out of the most significant bits; and wherein at least 
one of said at least one logic unit of at least one level of logic is arranged to generate 
an intermediate output as a reduced generate function in which the group or sub-group 
of bits of the binary inputs for said at least one logic unit is partitioned so that 
said at least one most significant bit comprises at least two most significant bits. 

In this aspect of the present invention, the sum bit is calculated directly using the 
reduced generate function, rather than generating the carry and logically exclusive 
OR combining the carry bit with the exclusive OR combination of input bits. In one 
embodiment the output logic comprises a multiplexer. 

In one embodiment further logic is provided for generating an output for a group of 
most significant bits of the binary inputs which is high if a carry is generated out of 
the group or if all of the bit level propagate bits for the group are high, wherein said 
output logic is arranged to generate the carry bit as a function of the logical AND of 
the output of said further logic and the intermediate output of said final level 
generated as a reduced generate function for a group of bits. 

Another aspect provides a logic circuit for generation of a carry bit output by 
combining two sets of binary inputs, the logic circuit comprising a first level of logic 
comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; at least one further 
level of logic for receiving outputs of at least one previous level of logic and 
comprising at least one logic unit for receiving the intermediate outputs from at least 
one logic unit of the at least one previous level and for generating an intermediate 
output; a final level of logic for receiving at least one of the intermediate outputs of 
at least one previous level of logic and comprising at least one logic unit for 
receiving the intermediate outputs from at least one logic unit of the at least one 
previous level of logic and for generating the carry bit output; wherein at least one 
logic unit of at least one of said further levels of logic is arranged to generate an 
intermediate output as a reduced generate function for a group of bits of the binary 
inputs using intermediate outputs from at least one higher level, at least one of said 
intermediate outputs being generated as a reduced generate function of a sub-group 
of bits of the binary inputs; wherein an intermediate output generated as a reduced 
generate function for a group or sub-group of bits, partitioned into at least one most 
significant bit and at least one least significant bit, is the logical OR of a generate 
function for the least significant bits and a function X for the most significant bits, 
where the generate function is high if a carry is generated out of the least significant 
bits and low if not, and X is a function which is high if a carry is generated out of the 
most significant bits, low if no carry is generated at any bit position in the most 
significant bits, and in a don't care state if a carry is generated at some bit position in 
the most significant bits but no carry is generated out of the most significant bits; and 
wherein at least one of said at least one logic unit of at least one of said first or 
further levels of logic is arranged to generate an intermediate output as a reduced 
generate function in which the group or sub-group of bits of the binary inputs for 
said at least one logic unit is partitioned so that said at least one most significant bit 
comprises at least two most significant bits. 

Another aspect provides a logic circuit for generation of a sum bit output by 
combining two sets of binary inputs, the logic circuit comprising a first level of logic 
comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; at least one further 
level of logic for receiving at least one of the intermediate outputs of at least one 
previous level of logic and comprising at least one logic unit for receiving the 
intermediate outputs from at least one logic unit of the at least one previous level and 
for generating an intermediate output; a final level of logic for receiving outputs of at 
least one previous level of logic and comprising at least one logic unit for receiving 
the intermediate outputs from at least one logic unit of the at least one previous level 
of logic and for generating the sum bit output; wherein at least one logic unit of at 
least one of said further levels of logic is arranged to generate an intermediate output 
as a reduced generate function for a group of bits of the binary inputs using 
intermediate outputs from at least one higher level, at least one of said intermediate 
outputs being generated as a reduced generate function of a sub-group of bits of the 
binary inputs; wherein an intermediate output generated as a reduced generate 
function for a group or sub-group of bits, partitioned into at least one most 
significant bit and at least one least significant bit, is the logical OR of a generate 
function for the least significant bits and a function X for the most significant bits, 
where the generate function is high if a carry is generated out of the least significant 
bits and low if not, and X is a function which is high if a carry is generated out of the 
most significant bits, low if no carry is generated at any bit position in the most 
significant bits, and in a don't care state if a carry is generated at some bit position in 
the most significant bits but no carry is generated out of the most significant bits; and 
wherein at least one of said at least one logic unit of at least one of said first or 
further levels of logic is arranged to generate an intermediate output as a reduced 
generate function in which the group or sub-group of bits of the binary inputs for 
said at least one logic unit is partitioned so that said at least one most significant bit 
comprises at least two most significant bits. 

In this aspect of the present invention, the sum bit is calculated directly using the 
reduced generate function, rather than generating the carry and logically exclusive 
OR combining the carry bit with the exclusive OR combination of input bits. In one 
embodiment the final level of logic includes at least one multiplexer. 

Another aspect provides a logic circuit for generation of a carry bit output by 

10 combining two sets of binary inputs, the logic circuit comprising first logic 

comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; final logic for 
receiving at least one intermediate output of the first logic and comprising at least 
one logic unit for receiving at least one intermediate output from at least one logic 

1 5 unit of the first logic and for generating the carry bit output; wherein at least one 
logic unit of at least one of said first logic is arranged to generate an intermediate 
output as a reduced generate function for a group of bits of the binary inputs; 
wherein an intermediate output generated as a reduced generate function for a group 
of bits, partitioned into at least one most significant bit and at least one least 

20 significant bit, is the logical OR of a generate function for the least significant bits 
and a function X for the most significant bits, where the generate function is high if a 
carry is generated out of the least significant bits and low if not, and X is a function 
which is high if a carry is generated out of the most significant bits, low if no carry is 
generated at any bit position in the most significant bits, and in a don't care state if a 

25 carry is generated at some bit position in the most significant bits but no carry is 
generated out of the most significant bits; and wherein at least one of said at least 
one logic unit of said first logic is arranged to generate an intermediate output for 
receipt by said final logic as a reduced generate function in which the group of bits 
of the binary inputs for said at least one logic unit is partitioned so that said at least 

30 one most significant bit comprises at least two most significant bits. 

Another aspect provides a logic circuit for generation of a sum bit output by
combining two sets of binary inputs, the logic circuit comprising first logic 
comprising a plurality of logic units, each logic unit for receiving a plurality of bits 
of the binary inputs and for generating an intermediate output; final logic for 
receiving at least one intermediate output of the first logic and comprising at least
one logic unit for receiving at least one intermediate output from at least one logic 
unit of the first logic and for generating the sum bit output; wherein at least one logic 
unit of at least one of said first logic is arranged to generate an intermediate output as 
a reduced generate function for a group of bits of the binary inputs; wherein an
intermediate output generated as a reduced generate function for a group of bits,
partitioned into at least one most significant bit and at least one least significant bit, 
is the logical OR of a generate function for the least significant bits and a function X 
for the most significant bits, where the generate function is high if a carry is 
generated out of the least significant bits and low if not, and X is a function which is
high if a carry is generated out of the most significant bits, low if no carry is
generated at any bit position in the most significant bits, and in a don't care state if a 
carry is generated at some bit position in the most significant bits but no carry is 
generated out of the most significant bits; and wherein at least one of said at least 
one logic unit of said first logic is arranged to generate an intermediate output for
receipt by said final logic as a reduced generate function in which the group of bits
of the binary inputs for said at least one logic unit is partitioned so that said at least 
one most significant bit comprises at least two most significant bits. 

In this aspect of the present invention, the sum bit is calculated directly using the
reduced generate function, rather than generating the carry and logically exclusive
OR combining the carry bit with the exclusive OR combination of input bits. In one 
embodiment the final logic includes at least one multiplexer. 

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output by combining two sets of binary inputs, the logic circuit comprising:
bit level carry generate and propagate function logic for receiving the binary inputs 
and for generating bit level carry generate and propagate function bits for said binary
inputs by respectively logically AND and OR combining respective bits of said 
binary inputs; first logic for receiving bit level carry generate and propagate function 
bits for a first group of at least three most significant bits of said binary inputs to 
generate a high output if a carry is generated out of the first group of most significant
bits of said binary input or if said carry propagate function bits for the most 
significant bits are all high; second logic for receiving bit level carry generate and 
propagate function bits for said binary inputs to generate a high output if any of said 
carry generate function bits for the most significant bits are high or if a carry is 
generated out of a second group of least significant bits of said binary input; and
combining logic for generating the carry bit output by combining outputs of said first 
and second logic. 
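
As an illustration of this aspect, the following is a minimal software sketch under stated assumptions: the combining logic is taken to be a logical AND (as in the first relationship given later for the design method), and the names bit, carry_out and group_carry are hypothetical. It is a behavioural check, not a gate-level design.

```python
def bit(x, i):
    """Return bit i of the integer x."""
    return (x >> i) & 1

def carry_out(a, b, n, k):
    """Carry out of the n-bit addition a + b.

    Bits n-1..k form the group of most significant bits and
    bits k-1..0 form the group of least significant bits (1 <= k < n).
    """
    g = [bit(a, i) & bit(b, i) for i in range(n)]   # bit level generate (AND)
    p = [bit(a, i) | bit(b, i) for i in range(n)]   # bit level propagate (OR)

    def group_carry(hi, lo):
        # Carry generated out of bits hi..lo with a carry-in of 0.
        c = 0
        for i in range(lo, hi + 1):
            c = g[i] | (p[i] & c)
        return c

    # First logic: carry out of the most significant bits, or all propagates high.
    first = group_carry(n - 1, k) | int(all(p[k:n]))
    # Second logic: any generate in the most significant bits, or carry out of the rest.
    second = int(any(g[k:n])) | group_carry(k - 1, 0)
    # Combining logic (assumed to be a logical AND).
    return first & second
```

With n = 8 and k = 5 the first group contains three most significant bits, the minimum this aspect requires.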

Another aspect of the present invention provides a logic circuit for generation of a 
sum bit output by combining two sets of binary inputs, the logic circuit comprising:
bit level carry generate and propagate function logic for receiving the binary inputs 
and for generating bit level carry generate and propagate function bits for said binary 
inputs by respectively logically AND and OR combining respective bits of said 
binary inputs; first logic for receiving bit level carry generate and propagate function 
bits for a first group of at least three most significant bits of said binary inputs to
generate a high output if a carry is generated out of the first group of most significant 
bits of said binary input or if said carry propagate function bits for the most 
significant bits are all high; second logic for receiving bit level carry generate and 
propagate function bits for said binary inputs to generate a high output if any of said 
carry generate function bits for the most significant bits are high or if a carry is
generated out of a second group of least significant bits of said binary input; and 
combining logic for generating the sum bit output by combining outputs of said first 
and second logic. 

In this aspect of the present invention, the sum bit is calculated directly rather than
generating the carry and logically exclusive OR combining the carry bit with the
exclusive OR combination of input bits. In one embodiment the combining logic
includes at least one multiplexer. 
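
One way a multiplexer could yield the sum bit directly is sketched below. This arrangement is an assumption made for illustration only and is not taken from the drawings: the output of the second logic selects between the exclusive OR of the input bits and that value inverted by the output of the first logic. The names bit, sum_bit and group_carry are hypothetical.

```python
def bit(x, i):
    """Return bit i of the integer x."""
    return (x >> i) & 1

def sum_bit(a, b, j, k):
    """Bit j of a + b, computed without forming the carry into position j.

    Bits j-1..k are treated as the most significant group below position j
    and bits k-1..0 as the least significant group (1 <= k < j).
    """
    g = [bit(a, i) & bit(b, i) for i in range(j)]
    p = [bit(a, i) | bit(b, i) for i in range(j)]

    def group_carry(hi, lo):
        c = 0
        for i in range(lo, hi + 1):
            c = g[i] | (p[i] & c)
        return c

    first = group_carry(j - 1, k) | int(all(p[k:j]))
    second = int(any(g[k:j])) | group_carry(k - 1, 0)
    h = bit(a, j) ^ bit(b, j)          # exclusive OR of the input bits
    # Multiplexer: when `second` is low the carry is 0 and the sum bit is h;
    # when it is high the carry equals `first`, so the sum bit is h XOR first.
    return (h ^ first) if second else h
```

Both multiplexer data inputs depend only on the most significant group and the input bits at position j, so in this sketch they can be prepared while the second logic is still being evaluated.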

In one embodiment of the present invention, the first logic comprises a plurality of 
first logic modules, each for receiving bit level carry generate and propagate function
bits for subgroups of the first group of at least three most significant bits of the 
binary inputs to generate a high output if a carry is generated for the subgroup of 
most significant bits of the binary input or if the carry propagate function bits for the 
subgroup of most significant bits are all high. 

In one embodiment, the second logic comprises a plurality of logic modules for 
receiving subgroups of the second group of least significant bits of the binary input 
to generate a carry for each of the subgroups and combining logic for combining the 
generated carries.

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output by combining two sets of binary inputs, the logic circuit comprising: 
bit level carry generate and propagate function logic for receiving the binary inputs 
and for generating bit level carry generate and propagate function bits for said binary
inputs by respectively logically AND and OR combining respective bits of said
binary inputs; first logic for receiving bit level generate and propagate function bits 
for a first group of at least three most significant bits of said binary inputs to 
generate an output as a function of a logical OR combination of a carry bit output for 
the first group of most significant bits of said binary input and a result of a logical
AND combination of propagate function bits for the most significant bits; second
logic for receiving bit level generate and propagate function bits for said binary 
inputs to generate an output as a function of a result of a logical OR combination of a 
carry bit output for a group of least significant bits of said binary inputs and a 
function B which is high if a carry is generated at any bit position in the most
significant bits; and combining logic for generating the carry bit output by
combining outputs of said first and second logic. 

Another aspect of the present invention provides a logic circuit for generation of a
sum bit output by combining two sets of binary inputs, the logic circuit comprising: 
bit level carry generate and propagate function logic for receiving the binary inputs 
and for generating bit level carry generate and propagate function bits for said binary
inputs by respectively logically AND and OR combining respective bits of said 
binary inputs; first logic for receiving bit level generate and propagate function bits 
for a first group of at least three most significant bits of said binary inputs to 
generate an output as a function of a logical OR combination of a carry bit output for
the first group of most significant bits of said binary input and a result of a logical
AND combination of propagate function bits for the most significant bits; second 
logic for receiving bit level generate and propagate function bits for said binary 
inputs to generate an output as a function of a result of a logical OR combination of a 
carry bit output for a group of least significant bits of said binary inputs and a
function B which is high if a carry is generated at any bit position in the most
significant bits; and combining logic for generating the sum bit output by combining 
outputs of said first and second logic. 

In this aspect of the present invention, the sum bit is calculated directly rather than 
generating the carry and logically exclusive OR combining the carry bit with the
exclusive OR combination of input bits. In one embodiment the combining logic 
includes at least one multiplexer. 

Another aspect of the present invention provides a binary adder circuit comprising 
the logic circuit as hereinabove described, and including addition logic comprising
exclusive OR logic and a multiplexer for determining an addition result including the
carry bit for the binary inputs.

Another aspect of the present invention provides a comparison logic circuit for 
comparing two binary inputs comprising the logic circuit as hereinabove described,
and including logic for using the carry bit to indicate whether one binary input
represents a binary number less than or more than another binary number
represented by the other binary input. 
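
The specification does not state here how the carry bit is turned into a comparison result. The sketch below assumes the well-known technique of adding one input to the bitwise complement of the other; the names carry_out, greater_than and greater_or_equal are hypothetical, and a simple ripple recurrence stands in for the carry logic described above.

```python
def carry_out(a, b, n, carry_in=0):
    """Carry out of the n-bit addition a + b + carry_in (ripple recurrence)."""
    c = carry_in
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        c = (ai & bi) | ((ai | bi) & c)
    return c

def greater_than(a, b, n):
    """True if a > b for unsigned n-bit values: carry out of a + NOT(b)."""
    not_b = ~b & ((1 << n) - 1)
    return carry_out(a, not_b, n) == 1

def greater_or_equal(a, b, n):
    """True if a >= b for unsigned n-bit values: carry out of a + NOT(b) + 1."""
    not_b = ~b & ((1 << n) - 1)
    return carry_out(a, not_b, n, carry_in=1) == 1
```

The second function corresponds to the carry generated when adding two inputs plus one, which is the case addressed by the modified generate logic described in the following paragraphs.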

The present invention also encompasses the use of reduced modified generate logic 
(D) which is simpler logic than modified generate logic. Modified generate logic
indicates if a carry is generated out of the addition of inputs plus one. This enables 
the logic unit D to be broken down and computed in a parallel fashion. 

Another aspect provides a logic circuit for generation of a carry bit output by adding
two sets of binary inputs plus one, the logic circuit comprising first logic for
receiving a plurality of bits of the binary inputs and for generating at least one 
intermediate output; final logic for receiving at least one intermediate output of the 
first logic and for generating the carry bit output; wherein said final logic is arranged 
to generate the carry bit output using a reduced modified generate function for a
group of bits of the binary inputs and at least one intermediate output from said first
logic at least one of which is generated as a reduced generate function or a reduced 
modified generate function of a sub-group of bits of the binary inputs; wherein a 
reduced generate function for a group of bits, partitioned into at least one most 
significant bit and at least one least significant bit, is the logical OR of a generate
function for the least significant bits and a function X for the most significant bits,
where the generate function is high if a carry is generated out of the least significant 
bits and low if not, and X is a function which is high if a carry is generated out of the 
most significant bits, low if no carry is generated at any bit position in the most 
significant bits, and in a don't care state if a carry is generated at some bit position in
the most significant bits but no carry is generated out of the most significant bits;
wherein a reduced modified generate function is the logical OR of a modified 
generate function for the least significant bits and the function X for the most 
significant bits, where the modified generate function is high if a carry is generated 
on adding the least significant bits plus one and low if not; wherein said final logic is
arranged to use a reduced modified generate function in which the group or sub-group
of bits of the binary inputs is partitioned so that said at least one most
significant bit comprises at least two most significant bits and/or said first logic is
arranged to generate at least one intermediate output as a reduced generate function 
or a reduced modified generate function in which the group or sub-group of bits of 
the binary inputs is partitioned so that said at least one most significant bit comprises 
at least two most significant bits.

In another aspect of the present invention the sum bit for two inputs plus one can 
similarly be computed. 

In one embodiment the reduced modified generate function uses a hyper propagate
function (PD) for the group of bits; the hyper propagate function comprises a logical
AND combination of the modified generate function (D) for at least one least 
significant bit of the group of bits and a propagate function (P) for at least one most 
significant bit of the group of bits, and the propagate function is high if a carry into a
group of bits would be propagated out of the group of bits. Thus in this embodiment
the function D is parallelised. The hyper propagate function PD can be further 
parallelised by using at least one hyper propagate function for a sub-group of bits. 

Another aspect provides a logic circuit for generation of a carry bit output by 
combining two sets of binary inputs plus one, the logic circuit comprising a first
level of logic comprising a plurality of logic units, each logic unit for receiving a 
plurality of bits of the binary inputs and for generating an intermediate output; at 
least one further level of logic including a final level of logic for receiving outputs of 
at least one previous level of logic and comprising at least one logic unit for 
receiving the intermediate outputs from at least one logic unit of at least one
previous level and for generating an intermediate output; and output logic for 
generating the carry bit output using at least one intermediate output from the final 
level of logic; wherein at least one logic unit of at least one level of logic is arranged 
to generate an intermediate output as a reduced generate function or a reduced 
modified generate function for a group of bits of the binary inputs using intermediate
outputs from at least one higher level at least one of which is generated as a reduced 
generate function or reduced modified generate function of a sub-group of bits
of the binary inputs; wherein an intermediate output generated as a reduced generate 
function for a group or sub-group of bits, partitioned into at least one most 
significant bit and at least one least significant bit, is the logical OR of a generate 
function for the least significant bits and a function X for the most significant bits,
where the generate function is high if a carry is generated out of the least significant 
bits and low if not, and X is a function which is high if a carry is generated out of the 
most significant bits, low if no carry is generated at any bit position in the most 
significant bits, and in a don't care state if a carry is generated at some bit position in
the most significant bits but no carry is generated out of the most significant bits;
wherein a reduced modified generate function is the logical OR of a modified 
generate function for the least significant bits and the function X for the most 
significant bits, where the modified generate function is high if a carry is generated 
on adding the least significant bits plus one and low if not; and wherein at least one
of said at least one logic unit of at least one level of logic is arranged to generate an
intermediate output as a reduced generate function or reduced modified generate 
function in which the group or sub-group of bits of the binary inputs for said at least 
one logic unit is partitioned so that said at least one most significant bit comprises at 
least two most significant bits. 

In another aspect of the present invention the sum bit for two inputs plus one can 
similarly be computed. 

In one embodiment further logic is provided for generating an output for a group of 
most significant bits of the binary inputs which is high if a carry is generated out of
the group or if all of the bit level propagate bits for the group are high, wherein said 
output logic is arranged to generate the carry bit as a function of the logical AND of 
the output of said further logic and the intermediate output of said final level 
generated as a reduced modified generate function for a group of bits. 

Another aspect provides a logic circuit for generation of a carry bit output by
combining two sets of binary inputs plus one, the logic circuit comprising a first 
level of logic comprising a plurality of logic units, each logic unit for receiving a 
plurality of bits of the binary inputs and for generating an intermediate output; at 
least one further level of logic for receiving at least one intermediate output of at
least one previous level of logic and comprising at least one logic unit for receiving 
the intermediate outputs from at least one logic unit of the at least one previous level 
and for generating an intermediate output; a final level of logic for receiving outputs 
of at least one previous level of logic and comprising at least one logic unit for
receiving the intermediate outputs from at least one logic unit of the at least one
previous level of logic and for generating the carry bit output; wherein at least one 
logic unit of at least one of said further levels of logic is arranged to generate an 
intermediate output as a reduced generate function or a reduced modified generate 
function for a group of bits of the binary inputs using at least one intermediate output
from at least one higher level, at least one of said intermediate outputs being
generated as a reduced generate function or a reduced modified generate function of 
a sub-group of bits of the binary inputs; wherein an intermediate output generated as 
a reduced generate function for a group or sub-group of bits, partitioned into at least 
one most significant bit and at least one least significant bit, is the logical OR of a
generate function for the least significant bits and a function X for the most
significant bits, where the generate function is high if a carry is generated out of the 
least significant bits and low if not, and X is a function which is high if a carry is 
generated out of the most significant bits, low if no carry is generated at any bit 
position in the most significant bits, and in a don't care state if a carry is generated at
some bit position in the most significant bits but no carry is generated out of the
most significant bits; wherein a reduced modified generate function is the logical OR 
of a modified generate function for the least significant bits and the function X for 
the most significant bits, where the modified generate function is high if a carry is 
generated on adding the least significant bits plus one and low if not; and wherein at
least one of said at least one logic unit of at least one of said first or further levels of
logic is arranged to generate an intermediate output as a reduced generate function or 
reduced modified generate function in which the group or sub-group of bits of the
binary inputs for said at least one logic unit is partitioned so that said at least one 
most significant bit comprises at least two most significant bits. 

In another aspect of the present invention the sum bit for two inputs plus one can
similarly be computed. 

Another aspect provides a logic circuit for generation of a carry bit output by 
combining two sets of binary inputs plus one, the logic circuit comprising first logic
comprising a plurality of logic units, each logic unit for receiving a plurality of bits
of the binary inputs and for generating an intermediate output; final logic for 
receiving at least one intermediate output of the first logic and comprising at least 
one logic unit for receiving at least one intermediate output from at least one logic 
unit of the first logic and for generating the carry bit output; wherein at least one
logic unit of at least one of said first logic is arranged to generate an intermediate
output as a reduced generate function or a reduced modified generate function for a 
group of bits of the binary inputs; wherein an intermediate output generated as a 
reduced generate function for a group of bits, partitioned into at least one most 
significant bit and at least one least significant bit, is the logical OR of a generate
function for the least significant bits and a function X for the most significant bits,
where the generate function is high if a carry is generated out of the least significant
bits and low if not, and X is a function which is high if a carry is generated out of the
most significant bits, low if no carry is generated at any bit position in the most 
significant bits, and in a don't care state if a carry is generated at some bit position in
the most significant bits but no carry is generated out of the most significant bits;
wherein a reduced modified generate function is the logical OR of a modified 
generate function for the least significant bits and the function X for the most 
significant bits, where the modified generate function is high if a carry is generated 
on adding the least significant bits plus one and low if not; and wherein at least one
of said at least one logic unit of said first logic is arranged to generate an
intermediate output for receipt by said final logic as a reduced generate function or a 
reduced modified generate function in which the group of bits of the binary inputs
for said at least one logic unit is partitioned so that said at least one most significant 
bit comprises at least two most significant bits. 

In another aspect of the present invention the sum bit for two inputs plus one can
similarly be computed. 

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output by combining two sets of binary inputs plus one, the logic circuit
comprising: bit level carry generate and propagate function logic for receiving the
binary inputs and for generating bit level carry generate and propagate function bits 
for said binary inputs by respectively logically AND and OR combining respective 
bits of said binary inputs; first logic for receiving bit level carry generate and 
propagate function bits for a first group of at least three most significant bits of said
binary inputs to generate a high output if a carry is generated out of the first group of
most significant bits of said binary input or if said carry propagate function bits for 
the most significant bits are all high; second logic for receiving bit level carry 
generate and propagate function bits for said binary inputs to generate a high output 
if any of said carry generate function bits for the most significant bits are high or if a
carry is generated out of a second group of least significant bits plus one of said
binary input; and combining logic for generating the carry bit output by combining 
outputs of said first and second logic. 
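
The plus-one case can be checked in the same behavioural style. The sketch below is an illustration under stated assumptions rather than the described circuit: the combining logic is assumed to be a logical AND, and the names bit, carry_out_plus_one and group_carry are hypothetical.

```python
def bit(x, i):
    """Return bit i of the integer x."""
    return (x >> i) & 1

def carry_out_plus_one(a, b, n, k):
    """Carry out of the n-bit addition a + b + 1.

    Bits n-1..k form the group of most significant bits and
    bits k-1..0 form the group of least significant bits (1 <= k < n).
    """
    g = [bit(a, i) & bit(b, i) for i in range(n)]   # bit level generate (AND)
    p = [bit(a, i) | bit(b, i) for i in range(n)]   # bit level propagate (OR)

    def group_carry(hi, lo, carry_in):
        c = carry_in
        for i in range(lo, hi + 1):
            c = g[i] | (p[i] & c)
        return c

    # First logic: carry out of the most significant bits, or all propagates high.
    first = group_carry(n - 1, k, 0) | int(all(p[k:n]))
    # Second logic: any generate in the most significant bits, or a carry out
    # of the least significant bits plus one.
    second = int(any(g[k:n])) | group_carry(k - 1, 0, 1)
    return first & second
```

Only the second logic changes relative to the case without the added one: the least significant group is evaluated with a carry-in of one.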

In another aspect of the present invention the sum bit for two inputs plus one can 
similarly be computed.

In one embodiment of the present invention, the first logic comprises a plurality of 
first logic modules, each for receiving bit level carry generate and propagate function 
bits for subgroups of the first group of at least three most significant bits of the 
binary inputs to generate a high output if a carry is generated for the subgroup of
most significant bits of the binary input or if the carry propagate function bits for the
subgroup of most significant bits are all high. 

In one embodiment, the second logic comprises a plurality of logic modules for 
receiving subgroups of the second group of least significant bits of the binary input
to generate a carry for each of the subgroups and combining logic for combining the
generated carries.

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output by combining two sets of binary inputs plus one, the logic circuit
comprising: bit level carry generate and propagate function logic for receiving the 
binary inputs and for generating bit level carry generate and propagate function bits 
for said binary inputs by respectively logically AND and OR combining respective 
bits of said binary inputs; first logic for receiving bit level generate and propagate 
function bits for a first group of at least three most significant bits of said binary
inputs to generate an output as a function of a logical OR combination of a carry bit 
output for the first group of most significant bits of said binary input and a result of a 
logical AND combination of propagate function bits for the most significant bits; 
second logic for receiving bit level generate and propagate function bits for said 
binary inputs to generate an output as a function of a result of a logical OR
combination of a carry bit output for a group of least significant bits plus one of said 
binary inputs and a function B which is high if a carry is generated at any bit position 
in the most significant bits; and combining logic for generating the carry bit output 
by combining outputs of said first and second logic. 

In another aspect of the present invention the sum bit for two inputs plus one can 
similarly be computed. 

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output by combining two sets of binary inputs, the logic circuit comprising
logic for receiving a plurality of bits of the binary inputs and for generating the carry 
bit output; wherein said logic is arranged to generate the carry bit output as the
logical AND of a generate function G for at least one most significant bit, a reduced 
modified generate function for the said at least one most significant bit and at least 
one middle bit of the binary inputs and a reduced generate function for said at least 
one middle bit and at least one least significant bit of the binary inputs; wherein said
reduced generate function is the logical OR of a generate function G for the at least 
one least significant bit and a function X for the at least one most significant bit and 
the at least one middle bit, where the generate function for the at least one least 
significant bit is high if a carry is generated out of the at least one least significant bit
and low if not, and X is a function which is high if a carry is generated out of the at
least one most significant bit and said at least one middle bit, low if no carry is 
generated at any bit position in the at least one most significant bit and said at least 
one middle bit, and in a don't care state if a carry is generated at some bit position in 
the at least one most significant bit and said at least one middle bit but no carry is
generated out of the at least one most significant bit and said at least one middle bit;
said reduced modified generate function is the logical OR of a modified generate 
function D for the at least one middle bit and the function X for the most significant 
bits, where the modified generate function D for the at least one middle bit is high if 
a carry is generated on adding the at least one middle bit plus one and low if not. 

Another aspect of the present invention provides a logic circuit for generation of a 
carry bit output D by combining two sets of binary inputs plus 1, the logic circuit 
comprising logic for receiving a plurality of bits of the binary inputs and for 
generating the carry bit output; wherein said logic is arranged to generate the carry
bit output as the logical AND of a modified generate function D for at least one most
significant bit, a first reduced modified generate function for the said at least one 
most significant bit and at least one middle bit of the binary inputs and a second 
reduced modified generate function for said at least one middle bit and at least one 
least significant bit of the binary inputs; wherein said second reduced modified
generate function is the logical OR of a modified generate function for the at least
one least significant bit and a function X for the at least one most significant bit and 
the at least one middle bit, where the modified generate function for the at least one
least significant bit is high if a carry is generated out of the at least one least 
significant bit plus one and low if not, and X is a function which is high if a carry is 
generated out of the at least one most significant bit and said at least one middle bit, 
low if no carry is generated at any bit position in the at least one most significant bit
and said at least one middle bit, and in a don't care state if a carry is generated at 
some bit position in the at least one most significant bit and said at least one middle 
bit but no carry is generated out of the at least one most significant bit and said at 
least one middle bit; said first reduced modified generate function is the logical OR 
of a modified generate function for the at least one middle bit and the function X for
the most significant bits, where the modified generate function for the at least one 
middle bit is high if a carry is generated on adding the at least one middle bit plus 
one and low if not. 
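
The three-factor form of this aspect can be illustrated as follows. This is a minimal sketch, assuming that X is realised as the function B described above (high if a carry is generated at any bit position of the bits concerned), which is one admissible choice for the don't care state; the names bit, carry_out_plus_one_3way, D and B are hypothetical.

```python
def bit(x, i):
    """Return bit i of the integer x."""
    return (x >> i) & 1

def carry_out_plus_one_3way(a, b, n, m, k):
    """Carry out of the n-bit addition a + b + 1 as a three-way logical AND.

    Bits n-1..m are the most significant bits, bits m-1..k the middle bits
    and bits k-1..0 the least significant bits (1 <= k < m < n).
    """
    g = [bit(a, i) & bit(b, i) for i in range(n)]
    p = [bit(a, i) | bit(b, i) for i in range(n)]

    def D(hi, lo):
        # Modified generate: carry out of bits hi..lo on adding them plus one.
        c = 1
        for i in range(lo, hi + 1):
            c = g[i] | (p[i] & c)
        return c

    def B(hi, lo):
        # High if a carry is generated at any bit position in hi..lo.
        return int(any(g[lo:hi + 1]))

    top = D(n - 1, m)                          # D for the most significant bits
    first_reduced = B(n - 1, m) | D(m - 1, k)  # X(most) OR D(middle)
    second_reduced = B(n - 1, k) | D(k - 1, 0) # X(most and middle) OR D(least)
    return top & first_reduced & second_reduced
```

The three factors depend on overlapping but independently available groups of bits, so in this sketch they can be evaluated side by side before the final AND.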

Another aspect of the present invention provides a method of designing a logic
circuit for generating a carry bit or sum bit from the combination of two j-bit binary
inputs, the method comprising: performing a first parallelisation of the function
$G_{j-1:0}$ for generating the carry in accordance with a first relationship
$G_{a:c} = D_{a:b}(X_{a:b} + G_{b-1:c})$ to generate a parallelised function
$D_{j-1:k}(X_{j-1:k} + G_{k-1:0})$, where G represents a
generate function for a group of bits from j-1 to 0 or from k-1 to 0, D represents a
logical OR of a generate function and a propagate function for a group of bits from
j-1 to k, and X represents a function which is high if a carry is generated out of the j-1
to k bits, low if no carry is generated at any bit position in the j-1 to k bits, and in a 
don't care state if a carry is generated at some bit position in the j-1 to k bits but no
carry is generated out of the j-1 to k bits; performing a second parallelisation of the
generate function of the parallelised function using a parallel prefix method to 
generate a further parallelised function; and designing a logic circuit in accordance 
with the further parallelised function. 

In one embodiment the method includes performing a further parallelisation of the
further parallelised function using the first relationship to parallelise the generate 
function for a group of least significant bits. 

In one embodiment the method includes performing a further parallelisation of the
further parallelised function using a parallel prefix method to parallelise the further 
parallelised generate function for a group of least significant bits. 

In one embodiment the method includes repeatedly performing further 
parallelisations of the further parallelised function using alternately the first
relationship and a parallel prefix method to parallelise the generate function for a 
group of least significant bits. 

In one embodiment the method includes performing a parallelisation of D using a 
1 5 third relationship D a:c = D a: b (X a: b+ Dt>-i :c ) to generate a further parallelised function 
for use in the logic design. 

In one embodiment the method includes performing a further parallelisation of D in 
the further parallelised function using a parallel prefix method. 

20 

In one embodiment the method includes repeatedly performing further 
parallelisations of D in the further parallelised function using alternately the third 
relationship and a parallel prefix method to parallelise D. 

25 In one embodiment of the present invention the method includes using at least one 
multiplexer in conjunction with logic for performing the further parallelised 
functions. 

The present invention allows for a greater degree of parallelisation than in either
Ling or IBM, thus speeding up the computation of carry and/or sum bits.






Embodiments of the present invention will now be described, by way of example
only, with reference to the accompanying drawings, in which:

Figure 1 shows a prior art logic circuit for generating a carry bit using single bit
generate and single bit propagate functions;

Figure 2 shows a prior art logic circuit for generating a carry bit using the Parallel
Prefix Method;

Figure 3 shows a prior art logic circuit for generating the most significant carry bit in
a 9 bit addition, using the base 3 Parallel Prefix method;

Figure 4 shows a prior art logic circuit for generating a carry bit using the Ling
method;

Figure 5a shows a prior art logic circuit for generating the most significant carry bit
in a 9 bit addition, using the Ling method combined with the base 3 Parallel Prefix
method;

Figure 5b shows a prior art logic circuit in which the Ling method is used to move an
XOR gate off the critical path;

Figure 6 shows a representation of the data structure of a j+1 bit addition, and the
derivation of intermediate functions X_{j:k}, D_{j:k}, G_{k-1:0} and output G_{j:0};

Figure 7 shows a logic circuit according to an embodiment of the invention in which
the functions X_{j:k}, D_{j:k}, G_{k-1:0} are implemented using logic gates and
combined to produce an output of G_{j:0};

Figure 8a shows a representation of the data structure of a j+1 bit addition, and the
derivation of intermediate functions X_{j:k}, D_{j:k}, D_{k-1:0} and D_{j:0};

Figure 8b shows a logic circuit in which the factorisation
D_{n-2:k}[X_{n-2:k} + G_{k-1:0}] is used to allow an XOR gate to be moved off the
critical path;

Figure 8c shows a logic circuit in which the factorisation
D_{n-2:k'}[X_{n-2:k'} + D_{k'-1:k}][X_{n-2:k} + G_{k-1:0}] is used to allow an XOR
gate to be moved off the critical path;

Figure 8d shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions X_{n-1:k} + G_{k-1:k'}, X_{k'-1:k''} + G_{k''-1:0},
P_{k-1:k'}D_{k'-1:k''} and D_{n-1:k};

Figure 8e shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions X_{n-1:k}, X_{k-1:m} + G_{m-1:k'},
X_{k'-1:m'} + G_{m'-1:0}, P_{m-1:k'}D_{k'-1:m'} and D_{n-1:m};

Figure 8f shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions X_{n-1:k}, X_{k-1:k'}, X_{k'-1:m} + G_{m-1:0} and
D_{n-1:m};

Figure 9 shows a logic circuit according to an embodiment of the invention in which
the functions D_{8:5}, B_{8:5}, G_{4:0} are implemented using logic gates and combined
to produce an output of G_{8:0};

Figure 10 shows a logic circuit according to an embodiment of the invention, in
which the functions B_{8:5}, G_{4:3}, P_{4:3} and G_{2:0} are implemented using logic
gates and combined to produce an output of G_{8:0};

Figure 11 shows a logic circuit according to an embodiment of the invention, in
which the functions B_{8:6}, B_{5:5} + G_{4:3}, P_{4:3}D_{2:2} and B_{2:2} + G_{1:0} are
implemented using logic gates and combined to produce an output of G_{8:0};

Figure 12 shows a ternary tree implementation of a final carry generator on a 9-bit
adder, in which the term D_{8:5} is generated using already pre-formed building blocks;

Figure 13 shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions P_{n-1:k}, P_{k-1:m}D_{m-1:k'},
P_{k'-1:m'}D_{m'-1:0}, B_{m-1:k'} + G_{k'-1:m'} and D_{n-1:m};

Figure 14a shows a representation of the structure of functions calculated at different
levels of an adder according to an embodiment of the invention;

Figure 14b shows a representation of the structure of functions calculated at different
levels of an adder according to an embodiment of the invention;

Figure 14c shows a representation of the structure of functions calculated at different
levels of an adder according to an embodiment of the invention;

Figure 14d shows a representation of the structure of functions calculated at different
levels of an adder according to an embodiment of the invention;

Figure 15 shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions X_{n-1:k}, X_{k-1:m} + G_{m-1:k'},
X_{k'-1:m'} + G_{m'-1:k''}, X_{k''-1:m''} + G_{m''-1:0}, P_{m-1:k'}D_{k'-1:m'},
P_{m'-1:k''}D_{k''-1:m''} and D_{n-1:m};

Figure 16 shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions P_{n-1:k}, P_{k-1:m}D_{m-1:k'},
P_{k'-1:m'}D_{m'-1:k''}, P_{k''-1:m''}D_{m''-1:0}, B_{m-1:k'} + G_{k'-1:m'} and
B_{m'-1:k''} + G_{k''-1:m''};

Figure 17 shows a representation of the data structure of an n bit addition, and the
derivation of intermediate functions X_{n-1:k}, X_{k-1:k'}, X_{k'-1:m} + G_{m-1:k''},
X_{k''-1:m'} + G_{m'-1:0}, P_{m-1:k''}D_{k''-1:m'} and D_{n-1:m}; and

Figure 18 shows a logic circuit according to an embodiment of the invention, for a
16 bit adder.

As a first embodiment of the invention there is disclosed a method which allows for
the carry to be formed as a combination of functions, which can be computed in
parallel, each of which is more complex than a simple single bit-level propagate.

An embodiment of the invention will now be illustrated by way of example.
Consider

G_{4:0} = g_4 + p_4g_3 + p_4p_3g_2 + p_4p_3p_2g_1 + p_4p_3p_2p_1g_0

Ling's approach breaks this as

G_{4:0} = p_4[g_4 + g_3 + p_3g_2 + p_3p_2g_1 + p_3p_2p_1g_0]

The inventors have observed that the delay of the carry term can be reduced
significantly by increasing the delay of some other term by more than a simple
bit-level propagate. For example:

G_{4:0} = [g_4 + p_4p_3][g_4 + g_3 + g_2 + p_2g_1 + p_2p_1g_0]
G_{4:0} = [g_4 + p_4g_3 + p_4p_3p_2][g_4 + g_3 + g_2 + g_1 + p_1g_0]

The inventors have further observed that the delay of the carry term can be reduced
significantly by increasing the delay of some other terms, rather than just one. For
example:

G_{4:0} = p_4[g_4 + p_3][g_4 + g_3 + g_2 + p_2g_1 + p_2p_1g_0]
G_{4:0} = p_4[g_4 + g_3 + p_3p_2][g_4 + g_3 + g_2 + g_1 + p_1g_0]
G_{4:0} = [g_4 + p_4p_3][g_4 + g_3 + p_2][g_4 + g_3 + g_2 + g_1 + p_1g_0]



In the following, a logic unit indicating carry generation from addition of 2 numbers
a_j...a_k and b_j...b_k plus 1 will be denoted by:

D_{j:k} = G_{j:k} + P_{j:k}

The inventors have observed that in general

G_{j:0} = D_{j:k}[X_{j:k} + G_{k-1:0}]
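
By way of illustration only, the factorised forms of G_{4:0} listed above can be checked exhaustively against the true carry out of a 5-bit addition. The following Python sketch is not part of the original disclosure; the function name is hypothetical, and it assumes the bit-level definitions g_i = a_i AND b_i and p_i = a_i OR b_i.

```python
from itertools import product

def check_g40_factorisations():
    """Compare each factorised form of G_{4:0} with the carry out of a 5-bit add."""
    for a, b in product(range(32), repeat=2):
        # Assumed bit-level functions: g_i = a_i AND b_i, p_i = a_i OR b_i.
        g0, g1, g2, g3, g4 = [(a >> i) & (b >> i) & 1 for i in range(5)]
        p0, p1, p2, p3, p4 = [((a >> i) | (b >> i)) & 1 for i in range(5)]
        carry = (a + b) >> 5  # reference carry out of bit 4
        forms = [
            # plain expansion
            g4 | p4 & g3 | p4 & p3 & g2 | p4 & p3 & p2 & g1 | p4 & p3 & p2 & p1 & g0,
            # Ling's form
            p4 & (g4 | g3 | p3 & g2 | p3 & p2 & g1 | p3 & p2 & p1 & g0),
            # two-factor forms
            (g4 | p4 & p3) & (g4 | g3 | g2 | p2 & g1 | p2 & p1 & g0),
            (g4 | p4 & g3 | p4 & p3 & p2) & (g4 | g3 | g2 | g1 | p1 & g0),
            # three-factor forms
            p4 & (g4 | p3) & (g4 | g3 | g2 | p2 & g1 | p2 & p1 & g0),
            p4 & (g4 | g3 | p3 & p2) & (g4 | g3 | g2 | g1 | p1 & g0),
            (g4 | p4 & p3) & (g4 | g3 | p2) & (g4 | g3 | g2 | g1 | p1 & g0),
        ]
        if any(f != carry for f in forms):
            return False
    return True
```

The check relies on g_i implying p_i, which holds for the OR form of the propagate function assumed here.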

Thus a method and apparatus are disclosed, as shown in Figure 6, for a logic unit
indicating carry generation of addition, the input bits being divided into two groups,
least significant and most significant bits, in which:

G_{k-1:0} denotes a logic unit indicating carry generation from addition of the least
significant bits.

D_{j:k} denotes a logic unit indicating carry generation from addition of the most
significant bits plus 1.

X_{j:k} denotes a logic unit which is high when a carry is generated out of the most
significant bits, and is low if no carry is generated at any bit position in the most
significant bits. The unit is in a don't care state if a carry is generated at some bit
position but no carry is generated out of the most significant bits.

The outputs of the above three logic units are combined in a logical unit for
generating G_{j:0}.

For 2-bit addition, the following Karnaugh maps respectively illustrate the logic unit
X, logic when a carry is generated, and logic when no carry is generated at any bit
position.

a1b1/a0b0 | 00 | 01 | 11         | 10
00        | 0  | 0  | Don't care | 0
01        | 0  | 0  | 1          | 0
11        | 1  | 1  | 1          | 1
10        | 0  | 0  | 1          | 0

a1b1/a0b0 | 00 | 01 | 11 | 10
00        | 0  | 0  | 0  | 0
01        | 0  | 0  | 1  | 0
11        | 1  | 1  | 1  | 1
10        | 0  | 0  | 1  | 0

a1b1/a0b0 | 00 | 01 | 11 | 10
00        | 0  | 0  |    | 0
01        | 0  | 0  |    | 0
11        | 1  | 1  |    | 1
10        | 0  | 0  |    | 0

The inventors have observed that the simplest implementation of the X_{j:i} unit is

B_{j:i} = g_j + g_{j-1} + ... + g_{i+1} + g_i

Thus, as illustrated in Figure 7:

G_{j:0} = D_{j:k}[B_{j:k} + G_{k-1:0}]

Example:

G_{8:0} = D_{8:5}[B_{8:5} + G_{4:0}]

The inventors have observed that logic indicating carry generation of the addition of
two numbers plus 1 can be parallelized as:

D_{j:0} = D_{j:k}[X_{j:k} + D_{k-1:0}]

Thus a method and logic circuit are disclosed, and illustrated in figure 8a, for carry
generation in, for example, addition, in which the input bits are divided into 2 groups,
least significant and most significant bits. The logic circuit comprises:

A logic unit indicating carry generation from addition of least significant bits plus 1,
denoted by D_{k-1:0}.

A logic unit indicating carry generation from addition of most significant bits plus 1,
denoted by D_{j:k}.

A logic unit, denoted by X_{j:k}, which is high when a carry is generated out of the
most significant bits, and is low if no carry is generated at any bit position in the
most significant bits. The unit is in a don't care state if a carry is generated at some
bit position but no carry is generated out of the most significant bits.

A logical unit for combining outputs of the 3 logic units.

The inventors have observed that in such a parallelization the simplest
implementation of the X_{j:i} unit is

B_{j:i} = g_j + g_{j-1} + ... + g_{i+1} + g_i

Thus D_{j:0} = D_{j:k}[B_{j:k} + D_{k-1:0}]
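
As an illustrative sketch only (not part of the original disclosure), the two relationships G_{j:0} = D_{j:k}[X_{j:k} + G_{k-1:0}] and D_{j:0} = D_{j:k}[X_{j:k} + D_{k-1:0}] can be checked in Python for every split point k of a small adder. The helper names are hypothetical. Two admissible choices of X are tried, namely B_{j:k} and G_{j:k}, to show that the don't care state leaves the result unchanged.

```python
from itertools import product

def group_G(g, p, j, i):
    """Carry generated out of bits j..i with carry-in 0."""
    c = 0
    for t in range(i, j + 1):
        c = g[t] | (p[t] & c)
    return c

def group_P(p, j, i):
    """1 if every bit position j..i propagates."""
    return int(all(p[i:j + 1]))

def group_B(g, j, i):
    """1 if any bit position j..i generates (the simplest X)."""
    return int(any(g[i:j + 1]))

def group_D(g, p, j, i):
    """D = G + P: carry out of bits j..i when the carry-in is 1."""
    return group_G(g, p, j, i) | group_P(p, j, i)

def check_splits(n=6):
    """Exhaustively check both relationships for all inputs and all k."""
    j = n - 1
    for a, b in product(range(1 << n), repeat=2):
        g = [(a >> t) & (b >> t) & 1 for t in range(n)]
        p = [((a >> t) | (b >> t)) & 1 for t in range(n)]
        carry0 = (a + b) >> n       # reference for G_{j:0}
        carry1 = (a + b + 1) >> n   # reference for D_{j:0}
        for k in range(1, n):
            d_hi = group_D(g, p, j, k)
            for x in (group_B(g, j, k), group_G(g, p, j, k)):
                if d_hi & (x | group_G(g, p, k - 1, 0)) != carry0:
                    return False
                if d_hi & (x | group_D(g, p, k - 1, 0)) != carry1:
                    return False
    return True
```

B_{j:k} needs only an OR of the bit-level generates, which is why the text singles it out as the simplest implementation of X.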

The inventors have further observed that further parallelization of carry generation
can be achieved by repeated use of parallelizing D:

G_{j:0} = D_{j:k'}[X_{j:k'} + D_{k'-1:k}][X_{j:k} + G_{k-1:0}]

Example: G_{15:0} = D_{15:8}[B_{15:8} + D_{7:4}][B_{15:4} + G_{3:0}]



An adder does have the problem that, to produce the actual carry out, the logical AND
of D and X + G needs to be formed, which would impact the delay. This extra delay
can however be eliminated by noting that the critical path for the n-th bit of an adder
is:

S_{n-1} = a_{n-1} ⊕ b_{n-1} ⊕ G_{n-2:0}
        = a_{n-1} ⊕ b_{n-1} ⊕ D_{n-2:k}[X_{n-2:k} + G_{k-1:0}]

By choosing an appropriate k, D_{n-2:k} can be computed faster than
X_{n-2:k} + G_{k-1:0} and so a multiplexer can be used. This is illustrated in figure 8b.

S_{n-1} = (a_{n-1} ⊕ b_{n-1} ⊕ D_{n-2:k}) [X_{n-2:k} + G_{k-1:0}]
        + (a_{n-1} ⊕ b_{n-1}) [X_{n-2:k} + G_{k-1:0}]^c

This method can be applied to the invention when G is produced as a combination of
more than two functions, by using more than one multiplexer as illustrated in figure
8c.

S_{n-1} = ((a_{n-1} ⊕ b_{n-1} ⊕ D_{n-2:k'}) [X_{n-2:k'} + D_{k'-1:k}]
        + (a_{n-1} ⊕ b_{n-1}) [X_{n-2:k'} + D_{k'-1:k}]^c) [X_{n-2:k} + G_{k-1:0}]
        + (a_{n-1} ⊕ b_{n-1}) [X_{n-2:k} + G_{k-1:0}]^c
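
The single-multiplexer form of S_{n-1} can be modelled as follows. This is a minimal Python sketch, not part of the original disclosure, assuming X = B and using hypothetical helper names. Both data inputs of the multiplexer depend only on the fast D term, and the slower X + G term acts as the select signal.

```python
from itertools import product

def gp(a, b, n):
    """Bit-level generate (AND) and propagate (OR) lists, LSB first."""
    g = [(a >> t) & (b >> t) & 1 for t in range(n)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(n)]
    return g, p

def G(g, p, j, i):
    c = 0
    for t in range(i, j + 1):
        c = g[t] | (p[t] & c)
    return c

def D(g, p, j, i):
    return G(g, p, j, i) | int(all(p[i:j + 1]))

def B(g, j, i):
    return int(any(g[i:j + 1]))

def msb_sum_mux(a, b, n, k):
    """Sum bit s_{n-1} selected by X + G rather than computed through it."""
    g, p = gp(a, b, n)
    half = ((a ^ b) >> (n - 1)) & 1            # a_{n-1} XOR b_{n-1}
    data1 = half ^ D(g, p, n - 2, k)           # used when the select is 1
    data0 = half                               # used when the select is 0
    select = B(g, n - 2, k) | G(g, p, k - 1, 0)
    return data1 if select else data0

def check_mux(n=6):
    """Compare with the true sum bit for all inputs and all valid k."""
    for a, b in product(range(1 << n), repeat=2):
        expected = ((a + b) >> (n - 1)) & 1
        for k in range(1, n - 1):
            if msb_sum_mux(a, b, n, k) != expected:
                return False
    return True
```

The two-multiplexer form of figure 8c follows the same pattern, with the output of a first multiplexer feeding the data1 input of a second.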

In another embodiment of the invention the inventors have realised that the
parallelization of carry generation as disclosed above can be combined with the
parallelization provided by the parallel prefix method to determine carries or
building blocks in a tree like structure and provide further speed up of carry
generation. The inventors have further realized that there are several methods for this
combination, the best combination depending on the type of technology being used, for
example static CMOS and dynamic circuits among others. The best combination will
become apparent to those skilled in the art.

This method of combining will now be illustrated.

The parallel prefix method provides the following parallelizations:

G_{j:i} = G_{j:k} + P_{j:k}G_{k-1:i}
D_{j:i} = G_{j:k} + P_{j:k}D_{k-1:i}

To implement G_{n-1:0} we can first parallelize using the method disclosed above:

G_{n-1:0} = D_{n-1:k}[X_{n-1:k} + G_{k-1:0}]

The inventors have observed that X_{n-1:k} + G_{k-1:0} is an OR of two terms, and that
the parallel prefix method provides a means of parallelizing G_{k-1:0} also as an OR of
two terms. It is well known that an OR-OR combination can be reduced to a single OR
combination, and in many technologies gates with 3 or 4 inputs can be implemented
efficiently. The combination of the two methods results in:

X_{n-1:k} + G_{k-1:0} = X_{n-1:k} + G_{k-1:k'} + P_{k-1:k'}G_{k'-1:0}

The inventors have realized that further parallelization can be achieved by
parallelizing G_{k'-1:0} by the method of the current invention to get an AND-AND
combination which can be reduced to a single AND combination, so that the efficiency
of larger input gates can be used.

Thus parallelizing G_{k'-1:0} as

G_{k'-1:0} = D_{k'-1:k''}[X_{k'-1:k''} + G_{k''-1:0}]

we arrive at:

X_{n-1:k} + G_{k-1:0} = [X_{n-1:k} + G_{k-1:k'}] + [P_{k-1:k'}D_{k'-1:k''}][X_{k'-1:k''} + G_{k''-1:0}]

This is illustrated in figure 8d.

The benefits of this method will now be illustrated by way of example:

G_{7:0} = D_{7:6}[B_{7:6} + G_{5:0}]
B_{7:6} + G_{5:0} = [B_{7:6} + G_{5:4}] + P_{5:4}G_{3:0}
                  = [B_{7:6} + G_{5:4}] + [P_{5:4}D_{3:2}][B_{3:2} + G_{1:0}]

The building blocks are given by:

D_{7:6} = g_7 + p_7p_6
B_{7:6} + G_{5:4} = g_7 + g_6 + g_5 + p_5g_4
B_{3:2} + G_{1:0} = g_3 + g_2 + g_1 + p_1g_0
P_{5:4}D_{3:2} = p_5p_4[g_3 + p_3p_2] = p_5p_4p_3[g_3 + p_2]

We now compare this to Ling's method:

G_{7:0} = p_7H_{7:0}
H_{7:0} = H_{7:4} + P_{6:3}H_{3:0}

In which the building blocks are given by:

P_{6:3} = p_6p_5p_4p_3
H_{7:4} = g_7 + g_6 + p_6g_5 + p_6p_5g_4
H_{3:0} = g_3 + g_2 + p_2g_1 + p_2p_1g_0

Notice that B_{3:2} + G_{1:0} is simpler than H_{3:0}, and B_{7:6} + G_{5:4} is simpler
than H_{7:4}, but P_{5:4}D_{3:2} and D_{7:6} are more complex than P_{6:3} and p_7
respectively. Nevertheless, the critical path of the current method is shorter.
Moreover, the implementation according to this embodiment of the invention has less
fan-out: p_6 has a fan-out of three in Ling's method, but the maximum fan-out of a
signal in the current embodiment is two.
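
For illustration only, both 8-bit decompositions discussed above can be written out gate by gate and compared with the true carry. The Python sketch below is not part of the original disclosure. The function names are hypothetical, and the building blocks are transcribed from the equations given in the text.

```python
from itertools import product

def gp8(a, b):
    """Bit-level generate (AND) and propagate (OR) for two 8-bit inputs."""
    g = [(a >> t) & (b >> t) & 1 for t in range(8)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(8)]
    return g, p

def g70_current_method(g, p):
    d76 = g[7] | p[7] & p[6]                   # D_{7:6}
    k_hi = g[7] | g[6] | g[5] | p[5] & g[4]    # B_{7:6} + G_{5:4}
    k_lo = g[3] | g[2] | g[1] | p[1] & g[0]    # B_{3:2} + G_{1:0}
    pd = p[5] & p[4] & p[3] & (g[3] | p[2])    # P_{5:4} D_{3:2}
    return d76 & (k_hi | pd & k_lo)

def g70_ling(g, p):
    h74 = g[7] | g[6] | p[6] & g[5] | p[6] & p[5] & g[4]   # H_{7:4}
    h30 = g[3] | g[2] | p[2] & g[1] | p[2] & p[1] & g[0]   # H_{3:0}
    p63 = p[6] & p[5] & p[4] & p[3]                        # P_{6:3}
    return p[7] & (h74 | p63 & h30)

def check_g70():
    """Exhaustive comparison of both forms with the carry out of bit 7."""
    for a, b in product(range(256), repeat=2):
        g, p = gp8(a, b)
        carry = (a + b) >> 8
        if g70_current_method(g, p) != carry or g70_ling(g, p) != carry:
            return False
    return True
```

In this sketch the first-level terms k_hi and k_lo contain one AND term each, whereas h74 and h30 contain two, which reflects the simplification noted in the text.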

The method disclosed above applies to binary trees. It will now be shown that further
speed up of carry generation can be achieved by combining more than two terms. The
method and apparatus will be illustrated by way of ternary trees.

We first illustrate Ling's method on ternary trees and point out the shortcomings.
The starting point of this method is to parallelize carry generation as:

G_{n-1:0} = p_{n-1}H_{n-1:0}

Then

H_{n-1:0} = g_{n-1} + G_{n-2:0}
          = g_{n-1} + G_{n-2:k'} + P_{n-2:k'}G_{k'-1:k''} + P_{n-2:k'}P_{k'-1:k''}G_{k''-1:0}

By applying G_{j:i} = p_jH_{j:i} to G_{k'-1:k''} and G_{k''-1:0} we have

H_{n-1:0} = H_{n-1:k'} + P_{n-2:k'-1}H_{k'-1:k''} + P_{n-2:k'}P_{k'-1:k''-1}H_{k''-1:0}

This has the form A + BC + DEF.

But notice that although H is a little simpler than G, this method offers no advantage
over the parallel prefix method when three of the H's are combined. Also note that
although ternary trees have fewer levels than binary trees, in this method each level
is much more complex than in the binary tree parallel prefix method. Moreover, very
high fan-out results. Thus the ternary tree method offers little if any advantage over
the prior art binary method.

Method and apparatus are now disclosed which overcome these shortcomings. We
divide the n input bits into three segments, as shown in figure 8e:
[n-1,k], [k-1,k'] and [k'-1,0].

By choosing an m lying in the middle segment we have:

G_{n-1:0} = D_{n-1:m}[X_{n-1:m} + G_{m-1:0}]

We now consider

X_{n-1:m} + G_{m-1:0} = X_{n-1:m} + G_{m-1:k'} + P_{m-1:k'}G_{k'-1:0}

By choosing an m' lying in the third segment and parallelizing G_{k'-1:0} as
D_{k'-1:m'}[X_{k'-1:m'} + G_{m'-1:0}] according to the method of one embodiment of
the current invention we have computed X_{n-1:m} + G_{m-1:0} in terms of three
smaller terms of the same form.

X_{n-1:m} + G_{m-1:0} = X_{n-1:k} + [X_{k-1:m} + G_{m-1:k'}] + [P_{m-1:k'}D_{k'-1:m'}][X_{k'-1:m'} + G_{m'-1:0}]

Note that this logic combination is a simple K_2 + K_1 + Q_0K_0 compared to
H_2 + P_2H_1 + P_2P_1H_0 for Ling's method. This has been achieved at the expense of a
more complex D since it ranges from n-1 to m. Those skilled in the art can choose an
appropriate m such that the critical path for the two units D_{n-1:m} and
[X_{n-1:m} + G_{m-1:0}] is balanced in a manner that results in faster carry generation
depending on the technology.

The inventors have observed that this method is very advantageous in Field
Programmable Gate Array technology. It is known that in this technology (i.e. in
look-up table based FPGAs) LUTs (Look-Up Tables) are provided which can compute
any logic function, and a very common choice for the number of variables which can be
input to an LUT is four. It is noted that K_2 + K_1 + Q_0K_0 is a function of
four variables whereas H_2 + P_2H_1 + P_2P_1H_0 is a function of five variables.

Notice that if m had been chosen to lie in the third segment, then D_{n-1:m} would have
become more complex still, but the result would be a simpler:

X_{n-1:m} + G_{m-1:0} = X_{n-1:k} + X_{k-1:k'} + [X_{k'-1:m} + G_{m-1:0}]

This is illustrated in figure 8f.

Note that the parallelization of D_{n-1:m} can be carried out in a similar manner. This is
illustrated in figure 8f.

Figures 9, 10 and 11 show a sequence of steps to derive the ternary tree
implementation of the final carry generation in a 9-bit adder.

G_{8:0} = D_{8:5}[B_{8:5} + G_{4:0}]                                          (Figure 9)
        = D_{8:5}[B_{8:5} + G_{4:3} + P_{4:3}G_{2:0}]                         (Figure 10)
        = D_{8:5}[B_{8:6} + [B_{5:5} + G_{4:3}] + P_{4:3}D_{2:2}[B_{2:2} + G_{1:0}]]  (Figure 11)
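
The three steps above can be checked numerically. The following Python sketch is illustrative only and not part of the original disclosure. The helper names are hypothetical, and the check uses random sampling with a fixed seed rather than a proof.

```python
import random

def G(g, p, j, i):
    """Group generate over bits j..i (carry-in 0)."""
    c = 0
    for t in range(i, j + 1):
        c = g[t] | (p[t] & c)
    return c

def P(p, j, i):
    return int(all(p[i:j + 1]))

def B(g, j, i):
    return int(any(g[i:j + 1]))

def D(g, p, j, i):
    return G(g, p, j, i) | P(p, j, i)

def g80_steps(a, b):
    """Return G_{8:0} as formed at each of the three derivation steps."""
    g = [(a >> t) & (b >> t) & 1 for t in range(9)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(9)]
    d85 = D(g, p, 8, 5)
    step1 = d85 & (B(g, 8, 5) | G(g, p, 4, 0))                           # Figure 9
    step2 = d85 & (B(g, 8, 5) | G(g, p, 4, 3) | P(p, 4, 3) & G(g, p, 2, 0))  # Figure 10
    step3 = d85 & (B(g, 8, 6) | (B(g, 5, 5) | G(g, p, 4, 3))
                   | P(p, 4, 3) & D(g, p, 2, 2) & (B(g, 2, 2) | G(g, p, 1, 0)))  # Figure 11
    return step1, step2, step3

def check_g80(trials=20000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(9), rng.getrandbits(9)
        carry = (a + b) >> 9
        if any(s != carry for s in g80_steps(a, b)):
            return False
    return True
```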

The inventors have observed that logic can be shared in the implementations of
D_{n-1:m} and [X_{n-1:m} + G_{m-1:0}].

This is now illustrated by way of example.

D_{8:5} = G_{8:6} + P_{8:5}
        = p_8[B_{8:8} + G_{7:6} + P_{7:5}]

and B_{8:8} + G_{7:6} is a suitable X_{8:6} which can replace B_{8:6} in the above
parallelization of G_{8:0}. This sharing of logic allows for the reduction of silicon
area. The complete parallelization of G_{8:0} according to the invention is shown in
figure 12.

We have thus far disclosed how a term of the form X_{n-1:m} + G_{m-1:0} can be
constructed out of terms over a smaller range, for example in the ternary tree method.

X_{n-1:m} + G_{m-1:0} = X_{n-1:k} + [X_{k-1:m} + G_{m-1:k'}] + [P_{m-1:k'}D_{k'-1:m'}][X_{k'-1:m'} + G_{m'-1:0}]

Note that each of the terms X_{n-1:k}, X_{k-1:m} + G_{m-1:k'} and X_{k'-1:m'} + G_{m'-1:0}
can be constructed in the same manner from terms over even smaller ranges. We have
disclosed a recursive method for forming X + G over a range in terms of X + G over
a smaller range in a tree structure. However this involves the PD term. A method is
now disclosed for forming the PD term over a range in terms of X + G and PD terms
over a smaller range.

Before illustrating this method we fix some notation:

By underlining a logic unit we mean the non-underlined logic unit but with
complemented inputs.

The inventors have observed the following relationships between G_{j:i} and D_{j:i}.

G_{j:i}^c = \underline{D}_{j:i}
\underline{G}_{j:i}^c = D_{j:i}
\underline{G}_{j:i} = D_{j:i}^c

Also it is easy to see that

P_{j:i} = \underline{B}_{j:i}^c
P_{j:i}^c = \underline{B}_{j:i}
\underline{P}_{j:i}^c = B_{j:i}
\underline{P}_{j:i} = B_{j:i}^c

Thus

P_{n-1:m}D_{m-1:0} = P_{n-1:k}[P_{k-1:m}D_{m-1:k'}]([B_{m-1:k'} + G_{k'-1:m'}] + [P_{k'-1:m'}D_{m'-1:0}])

Note that this logic combination is a simple Q_2Q_1(K_0 + Q_0). The process is illustrated
in figure 13. We have now disclosed a method and apparatus for recursively
constructing X + G and PD in a tree structure. Figure 14a shows the tree structure for
the carry out of a 27-bit adder. Figure 14b shows the tree structure for the carry out
of a 27-bit adder in a form from which those knowledgeable in the art can derive a
silicon layout of an adder. Figure 14c shows the tree structure for the carry out of a
32-bit adder according to the present invention. Figure 14d shows the tree structure
for the carry out of a 32-bit adder to aid the layout process.
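
By way of a hedged illustration (not part of the original disclosure), one level of the ternary recursion for both the X + G term and the PD term can be checked in Python. The segment boundaries n = 9, k = 6, m = 4, k' = 3 and m' = 1 are arbitrary choices for this sketch, X is taken to be B throughout, and the helper names are hypothetical.

```python
import random

def G(g, p, j, i):
    c = 0
    for t in range(i, j + 1):
        c = g[t] | (p[t] & c)
    return c

def P(p, j, i):
    return int(all(p[i:j + 1]))

def B(g, j, i):
    return int(any(g[i:j + 1]))

def D(g, p, j, i):
    return G(g, p, j, i) | P(p, j, i)

def check_ternary_recursion(n=9, k=6, m=4, k1=3, m1=1, trials=20000, seed=0):
    """k1 and m1 stand for k' and m' in the text."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(n), rng.getrandbits(n)
        g = [(a >> t) & (b >> t) & 1 for t in range(n)]
        p = [((a >> t) | (b >> t)) & 1 for t in range(n)]
        # X + G over n-1..0 from three smaller X + G terms and one PD term
        # (form K2 + K1 + Q0 K0).
        xg = (B(g, n - 1, k)
              | (B(g, k - 1, m) | G(g, p, m - 1, k1))
              | P(p, m - 1, k1) & D(g, p, k1 - 1, m1)
              & (B(g, k1 - 1, m1) | G(g, p, m1 - 1, 0)))
        if D(g, p, n - 1, m) & xg != (a + b) >> n:
            return False
        # PD over n-1..0 from smaller PD and X + G terms (form Q2 Q1 (K0 + Q0)).
        lhs = P(p, n - 1, m) & D(g, p, m - 1, 0)
        rhs = (P(p, n - 1, k) & P(p, k - 1, m) & D(g, p, m - 1, k1)
               & ((B(g, m - 1, k1) | G(g, p, k1 - 1, m1))
                  | P(p, k1 - 1, m1) & D(g, p, m1 - 1, 0)))
        if lhs != rhs:
            return False
    return True
```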

The inventors have observed that this method is very advantageous in Field
Programmable Gate Array technology. It is known that in this technology LUTs are
provided which can compute any logic function, and a very common choice for the
number of variables which can be input to an LUT is four. It is noted that
Q_2Q_1(K_0 + Q_0) is a function of four variables.

The inventors have observed that this method can be applied to higher order trees
such as quaternary, quintic and so forth. The quaternary method will now be
illustrated.

We divide the n input bits into four segments:
[n-1,k], [k-1,k'], [k'-1,k''] and [k''-1,0]. By choosing a suitable m in the second
segment we can parallelize G_{n-1:0} as

G_{n-1:0} = D_{n-1:m}[X_{n-1:m} + G_{m-1:0}]

We now construct X_{n-1:m} + G_{m-1:0} out of four smaller segments. Appropriate m' and
m'' are chosen in the third and fourth segments respectively. The G_{m-1:0} is
parallelized according to the parallel prefix method.

X_{n-1:m} + G_{m-1:0} = X_{n-1:m} + G_{m-1:k'} + P_{m-1:k'}G_{k'-1:k''} + P_{m-1:k'}P_{k'-1:k''}G_{k''-1:0}

The terms G_{k'-1:k''} and G_{k''-1:0} are now parallelized according to the method of the
current invention.

X_{n-1:m} + G_{m-1:0} = X_{n-1:k} + [X_{k-1:m} + G_{m-1:k'}] + P_{m-1:k'}D_{k'-1:m'}[X_{k'-1:m'} + G_{m'-1:k''}]
                      + P_{m-1:k'}P_{k'-1:k''}D_{k''-1:m''}[X_{k''-1:m''} + G_{m''-1:0}]

The inventors have further observed that P_{m-1:k'} can be replaced by
P_{m-1:k'}D_{k'-1:m'}, thus allowing for sharing of logic and so reducing area. This is
illustrated in figure 15.

The method of constructing PD in a quaternary tree can be derived as before

P_{n-1:m}D_{m-1:0} = P_{n-1:k}[P_{k-1:m}D_{m-1:k'}][[B_{m-1:k'} + G_{k'-1:m'}] + [P_{k'-1:m'}D_{m'-1:k''}]]
                     [[B_{m-1:k'} + G_{k'-1:m'}] + [B_{m'-1:k''} + G_{k''-1:m''}] + [P_{k''-1:m''}D_{m''-1:0}]]

This is illustrated in figure 16 and has the form Q_3Q_2[K_2 + Q_1][K_2 + K_1 + Q_0].

As with the ternary method, m can be chosen in different segments. Choosing m further
towards the least significant segment results in a less complex K = X + G but a more
complex D. This aspect of the invention will now be illustrated. If for the
quaternary method we choose m to lie in the third segment then:

X_{n-1:m} + G_{m-1:0} = X_{n-1:k} + X_{k-1:k'} + [X_{k'-1:m} + G_{m-1:k''}] + [P_{m-1:k''}D_{k''-1:m'}][X_{k''-1:m'} + G_{m'-1:0}]

This is illustrated in figure 17 and has the form K_3 + K_2 + K_1 + Q_1K_0. Notice that a
PD term can also be constructed having the form Q_3Q_2Q_1[K_1 + Q_0].

Figure 18 shows a quaternary tree implementation of the final carry generation in a
16-bit adder. This example in particular illustrates that the final Generate function is
an AND of at least three terms and that the first level is not Ling.

G_{15:0} = D_{15:6}K_{15:0} = D_{15:10}J_{15:6}K_{15:0}

The building blocks of the construction will now be considered.

It has been decided in this example that the sixteen bits be divided into 4 groups and
that the maximum complexity of the functions at the first level should
be K_3 + K_2 + K_1 + Q_1K_0.

K_{3:0} = g_3 + g_2 + g_1 + p_1g_0
K_{7:4} = g_7 + g_6 + g_5 + p_5g_4
K_{11:8} = g_{11} + g_{10} + g_9 + p_9g_8
K_{15:12} = g_{15} + g_{14} + g_{13} + p_{13}g_{12}

K_{15:0} = K_{15:12} + K_{11:8} + K_{7:4} + [P_{5:4}D_{3:2}]K_{3:0}

Note K_{15:0} has the form K_3 + K_2 + K_1 + Q_1K_0

P_{5:4}D_{3:2} = p_5p_4p_3[g_3 + p_2]

which has the form Q_2Q_1Q_0[K_1 + Q_0].

We now construct the D term:

D_{15:6} = D_{15:10}J_{15:6}
J_{15:6} = K_{15:12} + K_{11:8} + P_{9:8}D_{7:6}
P_{9:8}D_{7:6} = p_9p_8p_7[g_7 + p_6]

We need to construct the new term:

D_{15:10} = D_{15:14}[K_{15:12} + P_{13:12}D_{11:10}]
P_{13:12}D_{11:10} = p_{13}p_{12}p_{11}[g_{11} + p_{10}]
D_{15:14} = p_{15}[g_{15} + p_{14}]
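
The 16-bit construction above can be transcribed into a short Python model to confirm that the AND of the three terms D_{15:10}, J_{15:6} and K_{15:0} reproduces the carry out. This is a sketch for illustration only and is not part of the original disclosure. The function names are hypothetical, and the check uses random sampling with a fixed seed.

```python
import random

def g15_0(a, b):
    """G_{15:0} = D_{15:10} J_{15:6} K_{15:0} built from the listed blocks."""
    g = [(a >> t) & (b >> t) & 1 for t in range(16)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(16)]

    def K4(h):
        # First-level block K_{h:h-3} = g_h + g_{h-1} + g_{h-2} + p_{h-2} g_{h-3}
        return g[h] | g[h - 1] | g[h - 2] | p[h - 2] & g[h - 3]

    k3_0, k7_4, k11_8, k15_12 = K4(3), K4(7), K4(11), K4(15)

    pd5_2 = p[5] & p[4] & p[3] & (g[3] | p[2])          # P_{5:4} D_{3:2}
    k15_0 = k15_12 | k11_8 | k7_4 | pd5_2 & k3_0

    pd9_6 = p[9] & p[8] & p[7] & (g[7] | p[6])          # P_{9:8} D_{7:6}
    j15_6 = k15_12 | k11_8 | pd9_6

    pd13_10 = p[13] & p[12] & p[11] & (g[11] | p[10])   # P_{13:12} D_{11:10}
    d15_14 = p[15] & (g[15] | p[14])
    d15_10 = d15_14 & (k15_12 | pd13_10)

    return d15_10 & j15_6 & k15_0

def check_g15_0(trials=20000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(16), rng.getrandbits(16)
        if g15_0(a, b) != (a + b) >> 16:
            return False
    return True
```

Note that k15_12 and k11_8 each feed more than one second-level term in this model, which is the logic sharing the text refers to.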

The inventors have observed that the parallelizations G = D_2[X_2 + G_1] and
G = G_2 + P_2G_1 can be used in many different combinations to derive optimal
implementations depending on the type of technology, e.g. static CMOS, dynamic
circuits etc.

The following 16-bit example shows a different logical combination, which is
suitable for dynamic circuit techniques. It is known that in dynamic circuit
implementations wide OR gates can be implemented efficiently but wide AND gates
are slow in comparison. The inventors have further observed that the critical path of
a 16-bit adder is in forming a_{14} ⊕ b_{14} ⊕ G_{14:0}. The inventors have further
observed that implementation of D results in a faster circuit than G. The inventors
have further observed that if inverted primary inputs are available then

a_{14} ⊕ b_{14} ⊕ G_{14:0} = a_{14} ⊕ b_{14} ⊕ \underline{D}_{14:0}^c = a_{14} ⊕' b_{14} ⊕ \underline{D}_{14:0}

where ⊕' denotes the Exclusive NOR operation. A method is now disclosed for
constructing D_{14:0} which is suitable for dynamic circuit techniques.

D_{14:0} = D_{14:9}[B_{14:12} + B_{11:9} + D_{8:7}[B_{8:7} + G_{6:5}] + P_{8:6}P_{5:5}D_{4:3}[B_{4:3} + D_{2:0}]]

The building blocks for the bracketed terms are:

B_{4:3} + D_{2:0} = g_4 + g_3 + g_2 + p_2g_1 + p_2p_1p_0
P_{5:5}D_{4:3} = p_5g_4 + p_5p_4p_3
P_{8:6} = p_8p_7p_6

B_{8:7} + G_{6:5} = g_8 + g_7 + g_6 + p_6g_5
D_{8:7} = g_8 + p_8p_7

B_{11:9} = g_{11} + g_{10} + g_9
B_{14:12} = g_{14} + g_{13} + g_{12}

D_{14:9} is derived as follows:

D_{14:9} = D_{14:12}[B_{14:12} + D_{11:9}]

Having building blocks:

B_{14:12} = g_{14} + g_{13} + g_{12} + g_{11}
D_{11:9} = p_{11}g_{10} + p_{11}p_{10}p_9

D_{14:12} = g_{14} + p_{14}g_{13} + p_{14}p_{13}p_{12}
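
As a hedged illustration (not part of the original disclosure), the D_{14:0} construction can be modelled in Python and compared with the carry out of a 15-bit addition with a carry-in of one. In this sketch D_{11:9} is written in its full form g_{11} + p_{11}g_{10} + p_{11}p_{10}p_9, so the OR with B_{14:12} yields the same sum of terms as the two building blocks listed above. The function names are hypothetical. The last check also exercises the complemented-input relationship between G and D.

```python
import random

MASK15 = (1 << 15) - 1

def d14_0(a, b):
    """D_{14:0}: carry out of a 15-bit addition with carry-in 1."""
    g = [(a >> t) & (b >> t) & 1 for t in range(15)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(15)]

    b43_d20 = g[4] | g[3] | g[2] | p[2] & g[1] | p[2] & p[1] & p[0]  # B_{4:3} + D_{2:0}
    p5_d43 = p[5] & g[4] | p[5] & p[4] & p[3]                        # P_{5:5} D_{4:3}
    p8_6 = p[8] & p[7] & p[6]
    b87_g65 = g[8] | g[7] | g[6] | p[6] & g[5]                       # B_{8:7} + G_{6:5}
    d8_7 = g[8] | p[8] & p[7]
    b11_9 = g[11] | g[10] | g[9]
    b14_12 = g[14] | g[13] | g[12]

    # D_{14:9} = D_{14:12}[B_{14:12} + D_{11:9}], with D_{11:9} in full form
    d11_9 = g[11] | p[11] & g[10] | p[11] & p[10] & p[9]
    d14_12 = g[14] | p[14] & g[13] | p[14] & p[13] & p[12]
    d14_9 = d14_12 & (b14_12 | d11_9)

    return d14_9 & (b14_12 | b11_9 | d8_7 & b87_g65 | p8_6 & p5_d43 & b43_d20)

def check_d14_0(trials=20000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(15), rng.getrandbits(15)
        if d14_0(a, b) != (a + b + 1) >> 15:
            return False
        # G_{14:0} equals the complement of D_{14:0} on complemented inputs.
        if (a + b) >> 15 != 1 - d14_0(a ^ MASK15, b ^ MASK15):
            return False
    return True
```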

Thus far we have disclosed method and apparatus for a single carry generation logic
unit. Given two n-bit binary numbers a = a_{n-1}...a_1a_0 and b = b_{n-1}...b_1b_0, their
sum is the n+1 bit number given by s = s_n...s_1s_0, where

s_i = a_i ⊕ b_i ⊕ c_i

and c_i is the carry into position i. Thus it is required that all the carries be
generated. It is required that this be done with the highest speed circuit together with
efficient silicon utilization. This will now be illustrated by way of example. We
consider a 27-bit adder.

G_{26:0} = D_{26:14}K_{26:0}
K_{26:0} = B_{26:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{8:0} = B_{8:6} + K_{5:3} + P_{4:2}K_{2:0}
K_{17:9} = B_{17:15} + K_{14:12} + P_{13:11}K_{11:9}
K_{26:18} = B_{26:24} + K_{23:21} + P_{22:20}K_{20:18}
B_{26:18} = B_{26:24} + B_{23:21} + B_{20:18}

K_{2:0} = g_2 + g_1 + p_1g_0
K_{5:3} = g_5 + g_4 + p_4g_3
K_{8:6} = g_8 + g_7 + p_7g_6
K_{11:9} = g_{11} + g_{10} + p_{10}g_9
K_{14:12} = g_{14} + g_{13} + p_{13}g_{12}
K_{17:15} = g_{17} + g_{16} + p_{16}g_{15}
K_{20:18} = g_{20} + g_{19} + p_{19}g_{18}
K_{23:21} = g_{23} + g_{22} + p_{22}g_{21}
K_{26:24} = g_{26} + g_{25} + p_{25}g_{24}

B_{8:6} = g_8 + g_7 + g_6
B_{17:15} = g_{17} + g_{16} + g_{15}
B_{20:18} = g_{20} + g_{19} + g_{18}
B_{23:21} = g_{23} + g_{22} + g_{21}
B_{26:24} = g_{26} + g_{25} + g_{24}

D_{26:14} = D_{26:23}K_{26:18} + P_{26:18}D_{17:14}
D_{26:23} = p_{26}K_{26:24} + p_{26}P_{25:23}
D_{17:14} = p_{17}K_{17:15} + p_{17}P_{16:14}
P_{13:9}D_{8:5} = [P_{13:11}P_{10:8}][K_{8:6} + P_{7:5}]

P_{26:18} = P_{26:24}P_{23:21}P_{20:18}
P_{4:2} = p_4p_3p_2
P_{7:5} = p_7p_6p_5
P_{10:8} = p_{10}p_9p_8
P_{13:11} = p_{13}p_{12}p_{11}
P_{20:18} = p_{20}p_{19}p_{18}
P_{22:20} = p_{22}p_{21}p_{20}
P_{23:21} = p_{23}p_{22}p_{21}
P_{26:24} = p_{26}p_{25}p_{24}

This completes the circuit for G_{26:0}. We now present the remaining G_{k:0}.
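
For illustration only, the G_{26:0} circuit described above can be modelled with three small helper blocks and checked against ordinary integer addition. This Python sketch is not part of the original disclosure. The helper names K3, B3 and P3 are hypothetical shorthands for the three-bit blocks K_{h:h-2}, B_{h:h-2} and P_{h:h-2}, and P_{25:23} and P_{16:14} are formed with the same three-bit AND.

```python
import random

def g26_0(a, b):
    """G_{26:0} = D_{26:14} K_{26:0} built from three-bit first-level blocks."""
    g = [(a >> t) & (b >> t) & 1 for t in range(27)]
    p = [((a >> t) | (b >> t)) & 1 for t in range(27)]

    def K3(h):   # K_{h:h-2} = g_h + g_{h-1} + p_{h-1} g_{h-2}
        return g[h] | g[h - 1] | p[h - 1] & g[h - 2]

    def B3(h):   # B_{h:h-2} = g_h + g_{h-1} + g_{h-2}
        return g[h] | g[h - 1] | g[h - 2]

    def P3(h):   # P_{h:h-2} = p_h p_{h-1} p_{h-2}
        return p[h] & p[h - 1] & p[h - 2]

    k8_0 = B3(8) | K3(5) | P3(4) & K3(2)
    k17_9 = B3(17) | K3(14) | P3(13) & K3(11)
    k26_18 = B3(26) | K3(23) | P3(22) & K3(20)
    b26_18 = B3(26) | B3(23) | B3(20)

    pd13_5 = P3(13) & P3(10) & (K3(8) | P3(7))       # P_{13:9} D_{8:5}
    k26_0 = b26_18 | k17_9 | pd13_5 & k8_0

    d26_23 = p[26] & K3(26) | p[26] & P3(25)
    d17_14 = p[17] & K3(17) | p[17] & P3(16)
    p26_18 = P3(26) & P3(23) & P3(20)
    d26_14 = d26_23 & k26_18 | p26_18 & d17_14

    return d26_14 & k26_0

def check_g26_0(trials=20000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(27), rng.getrandbits(27)
        if g26_0(a, b) != (a + b) >> 27:
            return False
    return True
```

Random sampling rarely exercises long propagate chains, so directed cases such as an all-ones operand plus one are worth adding when using a model of this kind.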

G_{25:0} = D_{25:14}K_{25:0}
K_{25:0} = B_{25:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
B_{25:18} = B_{25:24} + B_{23:21} + B_{20:18}
B_{25:24} = g_{25} + g_{24}
D_{25:14} = D_{25:23}K_{25:18} + P_{25:18}D_{17:14}
K_{25:18} = B_{25:24} + K_{23:21} + P_{22:20}K_{20:18}
D_{25:23} = p_{25}B_{25:24} + P_{25:23}
P_{25:18} = P_{25:24}P_{23:21}P_{20:18}
P_{25:23} = p_{25}p_{24}p_{23}
P_{25:24} = p_{25}p_{24}
P_{4:2} = p_4p_3p_2

G_{24:0} = D_{24:14}K_{24:0}
K_{24:0} = K_{24:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{24:18} = g_{24} + K_{23:21} + P_{22:20}K_{20:18}
D_{24:14} = D_{24:23}K_{24:18} + P_{24:18}D_{17:14}
D_{24:23} = g_{24} + P_{24:23}
P_{24:18} = p_{24}P_{23:21}P_{20:18}
P_{24:23} = p_{24}p_{23}

G_{23:0} = D_{23:14}K_{23:0}
K_{23:0} = K_{23:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{23:18} = K_{23:21} + P_{22:20}K_{20:18}
D_{23:14} = p_{23}K_{23:18} + P_{23:18}D_{17:14}
P_{23:18} = P_{23:21}P_{20:18}

G_{22:0} = D_{22:14}K_{22:0}
K_{22:0} = K_{22:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{22:18} = K_{22:21} + P_{22:20}K_{20:18}
K_{22:21} = g_{22} + g_{21}
D_{22:14} = p_{22}K_{22:18} + P_{22:18}D_{17:14}
P_{22:18} = P_{22:21}P_{20:18}
P_{22:21} = p_{22}p_{21}

G_{21:0} = D_{21:14}K_{21:0}
K_{21:0} = K_{21:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{21:18} = g_{21} + P_{21:20}K_{20:18}
D_{21:14} = p_{21}K_{21:18} + P_{21:18}D_{17:14}
P_{21:18} = p_{21}P_{20:18}

G_{20:0} = D_{20:14}K_{20:0}
K_{20:0} = K_{20:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
D_{20:14} = p_{20}K_{20:18} + P_{20:18}D_{17:14}

G_{19:0} = D_{19:14}K_{19:0}
K_{19:0} = K_{19:18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{19:18} = g_{19} + g_{18}
D_{19:14} = p_{19}K_{19:18} + P_{19:18}D_{17:14}
P_{19:18} = p_{19}p_{18}

G_{18:0} = D_{18:14}K_{18:0}
K_{18:0} = g_{18} + K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}
D_{18:14} = g_{18} + p_{18}D_{17:14}

G_{17:0} = D_{17:14}K_{17:0}
K_{17:0} = K_{17:9} + [P_{13:9}D_{8:5}]K_{8:0}

G_{16:0} = D_{16:14}K_{16:0}
K_{16:0} = K_{16:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{16:9} = K_{16:15} + K_{14:12} + P_{13:11}K_{11:9}
K_{16:15} = g_{16} + g_{15}
D_{16:14} = p_{16}K_{16:15} + P_{16:14}
P_{16:14} = p_{16}p_{15}p_{14}

G_{15:0} = D_{15:14}K_{15:0}
K_{15:0} = K_{15:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{15:9} = g_{15} + K_{14:12} + P_{13:11}K_{11:9}
D_{15:14} = g_{15} + p_{15}p_{14}

G_{14:0} = p_{14}K_{14:0}
K_{14:0} = K_{14:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{14:9} = K_{14:12} + P_{13:11}K_{11:9}

G_{13:0} = p_{13}K_{13:0}
K_{13:0} = K_{13:9} + [P_{13:9}D_{8:5}]K_{8:0}
K_{13:9} = K_{13:12} + P_{13:11}K_{11:9}
K_{13:12} = g_{13} + g_{12}

G_{12:0} = p_{12}K_{12:0}
K_{12:0} = K_{12:9} + [P_{12:9}D_{8:5}]K_{8:0}
P_{12:9}D_{8:5} = [P_{12:11}P_{10:8}][K_{8:6} + P_{7:5}]
K_{12:9} = g_{12} + P_{12:11}K_{11:9}
P_{12:11} = p_{12}p_{11}

G_{11:0} = p_{11}K_{11:0}
K_{11:0} = K_{11:9} + [P_{11:9}D_{8:5}]K_{8:0}
P_{11:9}D_{8:5} = [p_{11}P_{10:8}][K_{8:6} + P_{7:5}]
G_{11:9} = p_{11}K_{11:9}

G_{10:0} = p_{10}K_{10:0}
K_{10:0} = K_{10:9} + [P_{10:9}D_{8:5}]K_{8:0}
P_{10:9}D_{8:5} = P_{10:8}[K_{8:6} + P_{7:5}]
K_{10:9} = g_{10} + g_9

G_{9:0} = p_9K_{9:0}
K_{9:0} = g_9 + [p_9D_{8:5}]K_{8:0}
p_9D_{8:5} = P_{9:8}[K_{8:6} + P_{7:5}]
P_{9:8} = p_9p_8

G_{8:0} = D_{8:5}K_{8:0}
D_{8:5} = p_8K_{8:6} + p_8P_{7:5}

G_{7:0} = D_{7:5}K_{7:0}
K_{7:0} = K_{7:6} + K_{5:3} + P_{4:2}K_{2:0}
D_{7:5} = p_7K_{7:6} + P_{7:5}
K_{7:6} = g_7 + g_6

G_{6:0} = D_{6:5}K_{6:0}
K_{6:0} = g_6 + K_{5:3} + P_{4:2}K_{2:0}
D_{6:5} = g_6 + P_{6:5}
P_{6:5} = p_6p_5

G_{5:0} = p_5K_{5:0}
K_{5:0} = K_{5:3} + P_{4:2}K_{2:0}

G_{4:0} = G_{4:3} + P_{4:2}K_{2:0}
G_{4:3} = g_4 + p_4g_3

G_{3:0} = g_3 + P_{3:2}K_{2:0}
P_{3:2} = p_3p_2

G_{2:0} = p_2K_{2:0}

G_{1:0} = g_1 + p_1g_0

G_{0:0} = g_0

The present invention is not limited to use for addition and subtraction, but may also
have other applications. For example, two numbers may be compared by generating
the most significant carry bit for the difference between the two numbers. It may not
be necessary to generate the other carry bits of this subtraction, or to actually
perform the subtraction. It may also not be necessary to take all of the least
significant bits of the two numbers into account when performing a comparison; the
numbers may, in effect, be rounded up or down before performing the comparison.
Thus, it is not essential that a circuit according to the invention should input all of
the least significant bits of the two input numbers in order to generate a carry bit and
perform a useful comparison of the numbers.
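
By way of illustration only, the comparison described above may be sketched in Python as follows. The sketch assumes unsigned inputs and forms the difference a - b as a + NOT(b) + 1, so that the most significant carry is high when a is greater than or equal to b. The function names and the parameter `dropped_bits` are hypothetical, and a simple ripple recurrence stands in for the carry logic of the invention.

```python
def msb_carry(x, y, width, carry_in):
    """Carry out of the most significant column of x + y + carry_in."""
    c = carry_in
    for i in range(width):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        g = xi & yi          # bit-level generate
        p = xi | yi          # bit-level propagate
        c = g | (p & c)
    return c


def greater_or_equal(a, b, width):
    """Compare unsigned a and b using only the carry out of a + ~b + 1."""
    mask = (1 << width) - 1
    return msb_carry(a & mask, ~b & mask, width, 1) == 1


def coarse_greater_or_equal(a, b, width, dropped_bits):
    """Same comparison after discarding some least significant bits,
    i.e. with both numbers rounded down before comparing."""
    return greater_or_equal(a >> dropped_bits, b >> dropped_bits,
                            width - dropped_bits)
```

No sum bits are produced at any point; only the final carry is used, which is the property relied upon in the paragraph above.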

It is not essential that the two input numbers a and b have the same number of digits.
If they do not, then either leading zeros may be added to the number with fewer
digits, or the hardware may be hardwired to set the generate functions to zero for the
most significant digits which are only present in one of the numbers, and to set the
propagate functions in the corresponding columns to equal the values of the bits of
the other input number.
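
The two alternatives described above may be sketched in Python as follows, purely by way of example. The helper names are hypothetical, bit lists are ordered least significant bit first, and a ripple recurrence is used in place of the carry logic of the invention. The sketch shows that padding with leading zeros and hardwiring the generate and propagate functions give the same carry.

```python
def to_bits(value, width):
    """Little-endian list of the low `width` bits of value."""
    return [(value >> i) & 1 for i in range(width)]


def carry_out_padded(a_bits, b_bits):
    """Carry out when the shorter input b is padded with leading zeros."""
    padded = b_bits + [0] * (len(a_bits) - len(b_bits))
    c = 0
    for ai, bi in zip(a_bits, padded):
        c = (ai & bi) | ((ai | bi) & c)
    return c


def carry_out_hardwired(a_bits, b_bits):
    """Carry out when columns present only in a have generate tied to
    zero and propagate equal to the bit of a."""
    c = 0
    for i, ai in enumerate(a_bits):
        if i < len(b_bits):
            g = ai & b_bits[i]
            p = ai | b_bits[i]
        else:
            g = 0        # generate hardwired to zero
            p = ai       # propagate equals the bit of the longer number
        c = g | (p & c)
    return c
```

For example, with a five-bit a and a three-bit b, both functions return the bit of weight 32 of a + b.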

The above generally describes a logic circuit for generation of a carry or sum bit
output by combining two sets of binary inputs. The logic circuit comprises: bit level
carry generate and propagate function logic for receiving the binary inputs and for
generating bit level carry generate and propagate function bits for said binary inputs
by respectively logically AND and OR combining respective bits of said binary
inputs; first logic for receiving bit level carry generate and propagate function bits
for a first group of at least three most significant bits of said binary inputs to
generate a high output if a carry is generated out of the first group of most significant
bits of said binary inputs or if said carry propagate function bits for the most
significant bits are all high; second logic for receiving bit level carry generate and
propagate function bits for said binary inputs to generate a high output if any of said
carry generate function bits for the most significant bits are high or if a carry is
generated out of a second group of least significant bits of said binary inputs; and
combining logic for generating the carry or sum bit output by combining outputs of
said first and second logic.
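
The arrangement summarised above may be modelled, as a rough sketch only, by the following Python code. It assumes that the combining logic is a logical AND of the outputs of the first and second logic, and it uses a ripple recurrence to decide whether a carry is generated out of a group. The function names and the parameter `split` (the number of bits in the least significant group) are hypothetical.

```python
from itertools import product


def group_generate(g, p):
    """High if a carry is generated out of the group (lists are LSB first)."""
    c = 0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
    return c


def carry_via_first_and_second_logic(a_bits, b_bits, split):
    """Carry out of all columns, formed from the first and second logic."""
    g = [x & y for x, y in zip(a_bits, b_bits)]   # bit-level generate (AND)
    p = [x | y for x, y in zip(a_bits, b_bits)]   # bit-level propagate (OR)
    g_lo, p_lo = g[:split], p[:split]             # least significant group
    g_hi, p_hi = g[split:], p[split:]             # most significant group
    # First logic: carry generated out of the MSB group, or all propagates high.
    first = group_generate(g_hi, p_hi) | int(all(p_hi))
    # Second logic: any MSB generate high, or carry generated out of LSB group.
    second = int(any(g_hi)) | group_generate(g_lo, p_lo)
    # Combining logic (assumed here to be a logical AND).
    return first & second


def matches_ripple_carry(width, split):
    """Exhaustively compare against a plain ripple carry over all inputs."""
    for bits in product((0, 1), repeat=2 * width):
        a, b = list(bits[:width]), list(bits[width:])
        g = [x & y for x, y in zip(a, b)]
        p = [x | y for x, y in zip(a, b)]
        if carry_via_first_and_second_logic(a, b, split) != group_generate(g, p):
            return False
    return True
```

Calling matches_ripple_carry(6, 3) checks every pair of six-bit inputs with a first group of three most significant bits.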

Although the present invention has been described with reference to specific
embodiments, it will be apparent to a person skilled in the art that modifications lie
within the spirit and scope of the present invention. Any documents referred to
above are hereby incorporated by reference for any purpose.
